Topics in sparse Bayesian machine learning

Date
2023
Abstract
This dissertation is devoted to addressing several challenging problems in machine learning via the Bayesian approach. One popular approach to Bayesian deep learning is to use Monte Carlo methods, such as Markov chain Monte Carlo (MCMC), to approximate the posterior distribution. These methods generate a set of samples from the posterior, which can be used to quantify the uncertainty in the parameters and to make probabilistic predictions. Bayesian methods in deep learning provide a framework for incorporating uncertainty into the learning process and can lead to more robust models with improved performance on unseen data. They have been applied to a wide range of problems, including image classification, reinforcement learning, and generative models, among others. This dissertation is organized as follows.

The first chapter develops a fast asynchronous sampler for sparse Bayesian learning. We propose a very fast approximate Markov chain Monte Carlo (MCMC) sampling framework that is applicable to a large class of sparse Bayesian inference problems, where the computational cost per iteration in several regression models is of order O(n(s + J)), where n is the sample size, s is the underlying sparsity of the model, and J is the size of a randomly selected subset of regressors. This cost can be further reduced by data sub-sampling when stochastic gradient Langevin dynamics are employed. The algorithm is an extension of the asynchronous Gibbs sampler of Johnson et al. (2013), but can be viewed from a statistical perspective as a form of Bayesian iterated sure independence screening (Fan et al. (2009)). We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal with high probability under some statistical assumptions. Furthermore, we show that its mixing time is at most linear in the number of regressors. We illustrate the algorithm with several models.

The second chapter is devoted to a one-step Laplace approximation for high-dimensional variable selection. We introduce a rapid one-step Laplace approximation method, referred to as OLAP, which effectively tackles the computational burden of variable selection in high dimensions. Our findings demonstrate that this approximation yields a consistent variable selection procedure under reasonable assumptions. Additionally, we establish that the mixing time of the Gibbs sampler used to sample from the OLAP posterior distribution scales linearly with the dimension p. Through comprehensive simulations, we validate the efficiency and accuracy of the proposed sampler, highlighting its potential to significantly enhance variable selection in practice.

The third chapter studies sparse (cyclical) MCMC in deep neural networks. We propose a general cyclical MCMC framework for a class of Bayesian inference problems, aiming to generate samples from a single mode within each cycle and to swap modes across cycles in order to capture multimodality. We provide extensive results on the predictive performance and mode coverage of different cyclical MCMC methods on high-dimensional Gaussian mixture models. We then introduce a sparse cyclical MCMC sampler for deep neural networks and present promising simulation results on uncertainty estimation and calibration.
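As a minimal, illustrative sketch (not taken from the dissertation), the snippet below combines two ingredients the abstract refers to: a stochastic gradient Langevin dynamics (SGLD) update based on data sub-sampling, and a cyclical (cosine) step-size schedule of the kind commonly used in cyclical MCMC. The model (Bayesian linear regression with a Gaussian prior), the function names, and all parameter values are hypothetical placeholders chosen only for illustration.

```python
# Illustrative sketch only: a generic SGLD step with a cyclical (cosine) step-size
# schedule. The model, prior, and tuning constants are placeholders and do not
# reproduce the dissertation's algorithms.
import numpy as np

def cyclical_step_size(t, total_steps, n_cycles, eps0):
    """Cosine step-size schedule that restarts at the beginning of each cycle."""
    cycle_len = total_steps // n_cycles
    pos = (t % cycle_len) / cycle_len          # position within the current cycle, in [0, 1)
    return 0.5 * eps0 * (np.cos(np.pi * pos) + 1.0)

def sgld_step(theta, X, y, eps, batch_size, rng, prior_var=1.0):
    """One SGLD update for Bayesian linear regression with a Gaussian prior."""
    n = X.shape[0]
    idx = rng.choice(n, size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    # Mini-batch (sub-sampled) estimate of the gradient of the log-likelihood.
    grad_loglik = (n / batch_size) * Xb.T @ (yb - Xb @ theta)
    grad_logprior = -theta / prior_var
    noise = rng.normal(size=theta.shape)
    # Langevin update: half-step along the gradient plus injected Gaussian noise.
    return theta + 0.5 * eps * (grad_loglik + grad_logprior) + np.sqrt(eps) * noise

# Tiny usage example on simulated data with a sparse signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
theta_true = np.zeros(20)
theta_true[:3] = [2.0, -1.5, 1.0]
y = X @ theta_true + 0.1 * rng.normal(size=500)

theta = np.zeros(20)
for t in range(2000):
    eps = cyclical_step_size(t, total_steps=2000, n_cycles=4, eps0=1e-4)
    theta = sgld_step(theta, X, y, eps, batch_size=50, rng=rng)
```

In a cyclical scheme of this type, the large step sizes at the start of each cycle encourage exploration (and possible mode swapping), while the small step sizes at the end of each cycle are typically used to collect samples near a single mode.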