Coresets for scalable Bayesian logistic regression

Duration: 48 mins 11 secs


About this item
Description: Broderick, T
Tuesday 4th July 2017 - 13:30 to 14:15
 
Created: 2017-07-21 14:04
Collection: Scalable inference; statistical, algorithmic, computational aspects
Publisher: Isaac Newton Institute
Copyright: Broderick, T
Language: eng (English)
 
Abstract: Co-authors: Jonathan H. Huggins (MIT), Trevor Campbell (MIT)

The use of Bayesian methods in large-scale data settings is attractive because of the rich hierarchical models, uncertainty quantification, and prior specification they provide. However, standard Bayesian inference algorithms are computationally expensive, so their direct application to large datasets can be difficult or infeasible. Rather than modify existing algorithms, we instead add a pre-processing step that leverages the insight that data is often redundant. In particular, we construct a weighted subset of the data (called a coreset) that is much smaller than the original dataset. We then input this small coreset to existing posterior inference algorithms without modification. To demonstrate the feasibility of this approach, we develop an efficient coreset construction algorithm for Bayesian logistic regression models. We provide theoretical guarantees on the size and approximation quality of the coreset -- both for fixed, known datasets, and in expectation for a wide class of data generative models. Our approach permits efficient construction of the coreset in both streaming and parallel settings, with minimal additional effort. We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size.
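
As a rough illustration of the "weighted subset" idea, the sketch below builds a coreset for logistic regression by importance sampling with replacement and evaluates a weighted log-likelihood on it. It is not the construction from the talk: the abstract does not give the algorithm, so plain importance sampling is swapped in, and the names build_coreset, weighted_log_lik, and the scores argument are hypothetical. With uniform scores it reduces to uniform subsampling and carries none of the guarantees mentioned in the abstract.

```python
import math
import random


def log_lik(theta, x, y):
    """Log-likelihood of one observation under logistic regression, y in {-1, +1}."""
    z = y * sum(t * xi for t, xi in zip(theta, x))
    # Numerically stable log(sigmoid(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def build_coreset(data, m, scores=None, rng=None):
    """Sample m indices with replacement; return (x, y, weight) triples.

    `scores` are hypothetical per-point importance scores (uniform if None).
    Weighting each draw by 1 / (m * p_i) keeps the weighted log-likelihood
    an unbiased estimate of the full-data log-likelihood.
    """
    rng = rng or random.Random(0)
    n = len(data)
    scores = scores or [1.0] * n
    total = sum(scores)
    probs = [s / total for s in scores]
    weights = {}
    for i in rng.choices(range(n), weights=probs, k=m):
        # Repeated draws of the same point accumulate into one weight.
        weights[i] = weights.get(i, 0.0) + 1.0 / (m * probs[i])
    return [(data[i][0], data[i][1], w) for i, w in weights.items()]


def weighted_log_lik(theta, coreset):
    """Coreset approximation to the full-data log-likelihood."""
    return sum(w * log_lik(theta, x, y) for x, y, w in coreset)


def full_log_lik(theta, data):
    """Exact log-likelihood over every observation."""
    return sum(log_lik(theta, x, y) for x, y in data)


# Synthetic data (assumed for illustration): intercept plus one Gaussian feature.
rng = random.Random(1)
data = []
for _ in range(2000):
    x = [1.0, rng.gauss(0, 1)]
    p = 1.0 / (1.0 + math.exp(-(0.5 + 2.0 * x[1])))
    data.append((x, 1 if rng.random() < p else -1))

coreset = build_coreset(data, m=200, rng=rng)
theta = [0.5, 2.0]
print(full_log_lik(theta, data), weighted_log_lik(theta, coreset))
```

The weighted log-likelihood can then be handed to any existing posterior inference routine in place of the full-data one, which is the sense in which the downstream algorithm is used without modification.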
Available Formats
Format        Quality   Bitrate           Size
MPEG-4 Video  640x360   1.93 Mbits/sec    699.37 MB
WebM          640x360   523.16 kbits/sec  184.69 MB
iPod Video    480x270   521.73 kbits/sec  184.12 MB
MP3           44100 Hz  249.77 kbits/sec  88.24 MB