Paper Title
Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation: Bayesian inference for latent Gaussian models and beyond
Paper Authors
Paper Abstract
Gaussian latent variable models are a key class of Bayesian hierarchical models with applications in many fields. Performing Bayesian inference on such models can be challenging, as Markov chain Monte Carlo algorithms struggle with the geometry of the resulting posterior distribution and can be prohibitively slow. An alternative is to use a Laplace approximation to marginalize out the latent Gaussian variables and then integrate out the remaining hyperparameters using dynamic Hamiltonian Monte Carlo, a gradient-based Markov chain Monte Carlo sampler. To implement this scheme efficiently, we derive a novel adjoint method that propagates the minimal information needed to construct the gradient of the approximate marginal likelihood. This strategy yields a scalable differentiation method that is orders of magnitude faster than state-of-the-art differentiation techniques when the hyperparameters are high dimensional. We prototype the method in the probabilistic programming framework Stan and test the utility of the embedded Laplace approximation on several models, including one where the dimension of the hyperparameter is $\sim$6,000. Depending on the case, the benefits can include an alleviation of the geometric pathologies that frustrate Hamiltonian Monte Carlo and a dramatic speed-up.
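To illustrate the marginalization scheme the abstract describes, the sketch below applies a Laplace approximation to integrate out a single latent Gaussian parameter in a hypothetical toy model (a normal likelihood with a normal prior on the latent mean). This is an illustrative assumption, not the paper's Stan implementation or its adjoint method: one would evaluate this approximate log marginal likelihood inside an HMC sampler over the hyperparameter `log_tau`. Because the toy model is conjugate, the Laplace approximation happens to be exact here, which makes it easy to check.

```python
import numpy as np

def laplace_log_marginal(y, log_tau, sigma=1.0):
    """Laplace approximation of log p(y | tau) for the toy model
        theta ~ Normal(0, tau),   y_i | theta ~ Normal(theta, sigma),
    with the latent theta marginalized out.
    """
    tau = np.exp(log_tau)
    n = len(y)
    # Negative Hessian of the log joint in theta (its precision at the mode)
    prec = n / sigma**2 + 1.0 / tau**2
    # Mode of the log joint in theta (closed form for this conjugate toy model)
    theta_hat = (np.sum(y) / sigma**2) / prec
    # Log joint density log p(y, theta_hat | tau) evaluated at the mode
    log_joint = (
        -0.5 * np.sum((y - theta_hat) ** 2) / sigma**2
        - n * np.log(sigma * np.sqrt(2.0 * np.pi))
        - 0.5 * theta_hat**2 / tau**2
        - np.log(tau * np.sqrt(2.0 * np.pi))
    )
    # Laplace correction: the Gaussian integral over theta contributes
    # 0.5 * log(2*pi / prec)
    return log_joint + 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(prec)
```

In the paper's setting the latent vector is high dimensional, the mode must be found numerically, and the gradient of this approximate marginal likelihood with respect to the hyperparameters is what the adjoint method computes efficiently.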