Two studies in resource-efficient inference: structural testing of networks, and selective classification
OA Version
Citation
Abstract
Inference systems suffer costs arising from information acquisition, and from communication and computational costs of executing complex models. This dissertation proposes, in two distinct themes, systems-level methods to reduce these costs without affecting the accuracy of inference by using ancillary low-cost methods to cheaply address most queries, while only using resource-heavy methods on 'difficult' instances.
The first theme concerns testing methods in structural inference of networks and graphical models, the proposal being that one first cheaply tests whether the structure underlying a dataset differs from a reference structure, and only estimates the new structure if this difference is large. This study focuses on theoretically establishing separations between the costs of testing and learning to determine when a strategy such as the above has benefits. For two canonical models---the Ising model, and the stochastic block model---fundamental limits are derived on the costs of one- and two-sample goodness-of-fit tests by determining information-theoretic lower bounds, and developing matching tests. A biphasic behaviour in the costs of testing is demonstrated: there is a critical size scale such that detection of differences smaller than this size is nearly as expensive as recovering the structure, while detection of larger differences has vanishing costs relative to recovery.
The second theme concerns using Selective classification (SC), or classification with an option to abstain, to control inference-time costs in the machine learning framework. The proposal is to learn a low-complexity selective classifier that only abstains on hard instances, and to execute more expensive methods upon abstention. Herein, a novel SC formulation with a focus on high-accuracy is developed, and used to obtain both theoretical characterisations, and a scheme for learning selective classifiers based on optimising a collection of class-wise decoupled one-sided risks. This scheme attains strong empirical performance, and admits efficient implementation, leading to an effective SC methodology. Finally, SC is studied in the online learning setting with feedback only provided upon abstention, modelling the practical lack of reliable labels without expensive feature collection, and a Pareto-optimal low-error scheme is described.
Description
License
Attribution-ShareAlike 4.0 International