Variational algorithms for Bayesian inference in latent Gaussian models
[S.l. : s.n.]
Promotor: Hoop, H. de
Co-promotor: Zwarts, J.
Co-promotor: Heskes, T.M.
The results in this thesis are based on applications of the expectation propagation (EP) algorithm to approximate marginals in models with Gaussian prior densities. A short introduction to variational methods in Bayesian inference is given in Chapter 1. In Chapter 2, we start with a model in which both the prior and the likelihood are Gaussian and study the properties of the message passing algorithm and the corresponding Bethe free energy. It turns out that, although in terms of the functional parameters q_k and q_ij the free energy has the same properties as in the discrete case, its behavior is quite surprising when it is expressed in a parametric form that incorporates the marginal consistency constraints (as in (1.3)). While in the discrete case the free energy is a bounded function, in the Gaussian case it can be unbounded when expressed in terms of the moment parameters of the approximate marginals. The typical relaxations applied in the discrete case (e.g. Wainwright et al., 2003; Wiegerinck and Heskes, 2003) seem to achieve the opposite effect by creating a convex objective with an unbounded global minimum. We show that the stable fixed points of the Gaussian message passing algorithm are local minima of the Gaussian free energy, and that both the convergence of the message passing algorithm and the existence of local minima are more likely for relaxation parameters that move the free energy closer to the mean field free energy. We also give necessary and sufficient conditions for the boundedness of the Gaussian Bethe free energy. In Chapter 3, we address the problem of approximating posterior marginals in models where p(x|M) is a Gaussian prior and the non-Gaussian likelihood factorizes into a product of terms, each depending on one variable only. The methods we propose are not restricted to these models, but they are particularly well suited to them.
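As a rough illustration of the Gaussian message passing studied in Chapter 2, the following is a minimal sketch of standard scalar Gaussian belief propagation for a density proportional to exp(-x'Jx/2 + h'x). It is not the thesis's own algorithm: it omits the relaxation parameters discussed above, and the function name `gaussian_bp`, the information-form parameterization (J, h), and the small chain example are assumptions made for illustration only.

```python
def gaussian_bp(J, h, n_iter=50):
    """Plain (non-relaxed) scalar Gaussian belief propagation.

    J is a symmetric precision matrix (list of lists), h the potential
    vector, so that p(x) is proportional to exp(-x'Jx/2 + h'x).
    Returns the means and variances of the approximate marginals.
    """
    n = len(h)
    neighbors = [[j for j in range(n) if j != i and J[i][j] != 0.0]
                 for i in range(n)]
    # Message i -> j in information form: a precision and a potential term.
    prec = {(i, j): 0.0 for i in range(n) for j in neighbors[i]}
    pot = {(i, j): 0.0 for i in range(n) for j in neighbors[i]}

    for _ in range(n_iter):
        new_prec, new_pot = {}, {}
        for (i, j) in prec:
            # Cavity at node i: local term plus all incoming messages except j's.
            p_cav = J[i][i] + sum(prec[(k, i)] for k in neighbors[i] if k != j)
            h_cav = h[i] + sum(pot[(k, i)] for k in neighbors[i] if k != j)
            # Integrate x_i out of the pairwise term to get the message to j.
            new_prec[(i, j)] = -J[i][j] ** 2 / p_cav
            new_pot[(i, j)] = -J[i][j] * h_cav / p_cav
        prec, pot = new_prec, new_pot  # parallel update of all messages

    means, variances = [], []
    for i in range(n):
        p = J[i][i] + sum(prec[(k, i)] for k in neighbors[i])
        m = h[i] + sum(pot[(k, i)] for k in neighbors[i])
        means.append(m / p)
        variances.append(1.0 / p)
    return means, variances


# Hypothetical three-node chain; on a tree the marginals are exact.
J = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
h = [1.0, 0.0, 1.0]
means, variances = gaussian_bp(J, h)
print(means, variances)
```

For this chain the exact marginal means are 1, 1, 1 and the exact variances are 0.75, 1, 0.75, which the sketch reproduces up to floating-point error. On graphs with cycles the same updates need not converge, which is the setting in which the boundedness and relaxation questions above become relevant.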
The approximate posterior marginals in these models are typically computed by approximating the non-Gaussian posterior density with a multivariate Gaussian density, using either the variational objective or expectation propagation. The marginals of this Gaussian are then used as approximations of the posterior marginals. In Chapter 3 we go beyond these Gaussian marginal approximations and derive a framework for improving on them. The improved marginals seem to perform well in the comfort zone of EP, that is, in models where the posterior density is log-concave. Although we do not provide an estimate of, or an upper bound on, the error, the approximations have the useful property that they can be gradually improved whenever better accuracy is needed. In Chapter 4, we define a multivariate scale mixture distribution that can be used as a sparsifying prior in linear regression and logistic regression. We derive an efficient expectation propagation algorithm for approximate inference in these models, and we use the models to assess the activation of brain areas in task-related MEG and fMRI experiments. The multivariate prior we introduce is based on the scale mixture representation of the univariate double exponential prior and is designed to introduce prior correlations between the magnitudes of the regression coefficients. This was motivated by the observation that in many MEG and fMRI applications the activations have smooth spatial and temporal patterns, that is, neighboring brain areas (in space, in time, or both) are likely to have similar activation levels. The prior keeps the regression coefficients a priori uncorrelated, but it correlates their magnitudes. The symmetry properties of the prior lead to posterior densities that imply block diagonal correlation structures. The approximating multivariate Gaussians inherit this property.
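The scale mixture representation of the univariate double exponential (Laplace) prior mentioned above can be sketched as follows: a zero-mean Gaussian whose variance is drawn from an exponential distribution with mean 2b² has a Laplace marginal with scale b. This covers only the standard univariate case, not the multivariate prior defined in the thesis; the function names and the choice b = 1 are assumptions made for illustration.

```python
import math
import random


def sample_laplace_via_scale_mixture(scale, rng):
    """Draw one Laplace(0, scale) sample as a Gaussian scale mixture."""
    # Variance ~ Exponential with mean 2 * scale**2 (rate = 1 / mean).
    variance = rng.expovariate(1.0 / (2.0 * scale ** 2))
    # Conditional on the variance, the sample is zero-mean Gaussian.
    return rng.gauss(0.0, math.sqrt(variance))


def empirical_moments(n, scale, seed=0):
    """Return (mean absolute value, second moment) of n mixture samples."""
    rng = random.Random(seed)
    xs = [sample_laplace_via_scale_mixture(scale, rng) for _ in range(n)]
    mean_abs = sum(abs(x) for x in xs) / n
    second_moment = sum(x * x for x in xs) / n
    return mean_abs, second_moment


# For Laplace(0, b): E|x| = b and E[x^2] = 2 * b**2.
mean_abs, second_moment = empirical_moments(200_000, scale=1.0)
print(mean_abs, second_moment)
```

With b = 1 the two printed values should be close to 1 and 2 respectively. Correlating the magnitudes of several coefficients, as described above, would amount to making the variances of different coefficients dependent while keeping the conditional Gaussians independent; that step is not shown here.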
The block diagonal covariance structure and the typically underdetermined regression models make the computational complexity of EP scale linearly with the number of regression coefficients. We show that the importance maps created from the approximate posterior moments of the scale parameters are meaningful and neurobiologically reasonable.
Dissertation