Langevin Autoencoders for Learning Deep Latent Variable Models

Paper title

Langevin Autoencoders for Learning Deep Latent Variable Models

Authors

Shohei Taniguchi, Yusuke Iwasawa, Wataru Kumagai, Yutaka Matsuo

Abstract

Markov chain Monte Carlo (MCMC), such as Langevin dynamics, is valid for approximating intractable distributions. However, its usage is limited in the context of deep latent variable models owing to costly datapoint-wise sampling iterations and slow convergence. This paper proposes the amortized Langevin dynamics (ALD), wherein datapoint-wise MCMC iterations are entirely replaced with updates of an encoder that maps observations into latent variables. This amortization enables efficient posterior sampling without datapoint-wise iterations. Despite its efficiency, we prove that ALD is valid as an MCMC algorithm, whose Markov chain has the target posterior as a stationary distribution under mild assumptions. Based on the ALD, we also present a new deep latent variable model named the Langevin autoencoder (LAE). Interestingly, the LAE can be implemented by slightly modifying the traditional autoencoder. Using multiple synthetic datasets, we first validate that ALD can properly obtain samples from target posteriors. We also evaluate the LAE on the image generation task, and show that our LAE can outperform existing methods based on variational inference, such as the variational autoencoder, and other MCMC-based methods in terms of the test likelihood.
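
For readers unfamiliar with the sampler the abstract builds on, the following is a minimal sketch of plain (unadjusted) Langevin dynamics, not the amortized variant (ALD) that the paper proposes. The Gaussian toy target, the step size, the chain count, and the names `langevin_sample` and `grad_log_gaussian` are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def langevin_sample(grad_log_p, z0, step_size, n_steps, rng):
    """Run unadjusted Langevin dynamics and return the final state.

    Update rule: z <- z + step_size * grad log p(z) + sqrt(2 * step_size) * noise
    """
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + step_size * grad_log_p(z) + np.sqrt(2.0 * step_size) * noise
    return z

# Toy target (an assumption for illustration): a 1-D Gaussian N(MU, SIGMA^2).
MU, SIGMA = 2.0, 0.5

def grad_log_gaussian(z):
    # Gradient of log N(z; MU, SIGMA^2) with respect to z.
    return -(z - MU) / SIGMA**2

rng = np.random.default_rng(0)
# 2000 independent chains, each started at 0 and run for 1000 steps.
final = langevin_sample(grad_log_gaussian, np.zeros(2000), 1e-2, 1000, rng)
print(final.mean(), final.std())  # should be close to MU and SIGMA
```

In a latent variable model, a loop of this kind would have to be run separately for every observation, which is the datapoint-wise cost the abstract refers to. ALD, as described above, replaces those per-datapoint iterations with updates of a shared encoder; that step is not shown here.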
