Paper Title
Training Deep Energy-Based Models with f-Divergence Minimization
Paper Authors
Paper Abstract
Deep energy-based models (EBMs) are very flexible in distribution parametrization but computationally challenging because of the intractable partition function. They are typically trained via maximum likelihood, using contrastive divergence to approximate the gradient of the KL divergence between the data and model distributions. While the KL divergence has many desirable properties, other f-divergences have shown advantages in training implicit density generative models such as generative adversarial networks. In this paper, we propose a general variational framework termed f-EBM to train EBMs using any desired f-divergence. We introduce a corresponding optimization algorithm and prove its local convergence using non-linear dynamical systems theory. Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs with f-divergences other than KL.
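The variational treatment of f-divergences that the abstract refers to rests on the Fenchel-dual lower bound D_f(P || Q) >= E_P[T(x)] - E_Q[f*(T(x))], tightened over a critic T, where f* is the convex conjugate of f. Below is a minimal numpy sketch of this bound for the KL case (f(u) = u log u, f*(t) = exp(t - 1)), using the analytically optimal critic for two Gaussians so the estimate can be checked against the closed-form KL. This illustrates the general f-GAN-style principle only; the exact f-EBM objective in the paper additionally accounts for how the EBM's energy and partition function enter the bound.

```python
import numpy as np

# Fenchel-dual (variational) lower bound on an f-divergence:
#   D_f(P || Q) >= E_{x~P}[T(x)] - E_{x~Q}[f*(T(x))],
# shown here for KL, where f(u) = u log u and f*(t) = exp(t - 1).
# Illustrative setup (not from the paper): P = N(0,1), Q = N(1,1),
# for which KL(P || Q) = 0.5 in closed form.

rng = np.random.default_rng(0)
n = 200_000
p_samples = rng.normal(0.0, 1.0, n)  # x ~ P
q_samples = rng.normal(1.0, 1.0, n)  # x ~ Q

def log_ratio(x):
    # log p(x)/q(x) for the two unit-variance Gaussians above
    return -0.5 * x**2 + 0.5 * (x - 1.0) ** 2

# For KL, the bound is tight at the critic T*(x) = 1 + log p(x)/q(x).
T_p = 1.0 + log_ratio(p_samples)
T_q = 1.0 + log_ratio(q_samples)

# Monte Carlo estimate of the lower bound at the optimal critic
estimate = T_p.mean() - np.exp(T_q - 1.0).mean()
closed_form = 0.5  # KL(N(0,1) || N(1,1)) = mu^2 / 2
print(round(float(estimate), 2))
```

In f-GAN-style training (and, with the appropriate modifications, in f-EBM), the critic T is a learned network maximized over, while the model distribution Q is trained to minimize the resulting bound.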