Paper Title
ControlVAE: Tuning, Analytical Properties, and Performance Analysis
Paper Authors
Paper Abstract
This paper reviews the novel concept of the controllable variational autoencoder (ControlVAE), discusses its parameter tuning to meet application needs, derives its key analytical properties, and offers useful extensions and applications. ControlVAE is a new variational autoencoder (VAE) framework that combines automatic control theory with the basic VAE to stabilize the KL-divergence of VAE models to a specified value. It leverages a non-linear PI controller, a variant of proportional-integral-derivative (PID) control, to dynamically tune the weight of the KL-divergence term in the evidence lower bound (ELBO), using the output KL-divergence as feedback. This allows us to precisely control the KL-divergence to a desired value (set point), which is effective in avoiding posterior collapse and learning disentangled representations. To improve the ELBO over the regular VAE, we provide a simplified theoretical analysis that informs the choice of the KL-divergence set point for ControlVAE. We observe that, compared to other methods that seek to balance the two terms in the VAE objective, ControlVAE leads to better learning dynamics. In particular, it can achieve a good trade-off between reconstruction quality and KL-divergence. We evaluate the proposed method on three tasks: image generation, language modeling, and disentangled representation learning. The results show that ControlVAE achieves much better reconstruction quality than other methods with comparable disentanglement. On the language modeling task, ControlVAE avoids posterior collapse (KL vanishing) and improves the diversity of generated text. Moreover, our method can change the optimization trajectory, improving the ELBO and the reconstruction quality for image generation.
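The abstract describes the core mechanism: a non-linear PI controller that tunes the KL weight (beta) in the ELBO using the observed KL-divergence as feedback. The following is a minimal illustrative sketch of such a controller, not the authors' exact implementation; the gains `kp` and `ki`, the clamping bounds, and the class name are assumptions chosen for readability.

```python
import math

class PIController:
    """Sketch of a non-linear PI controller that tunes the KL weight (beta).

    The proportional term is passed through a sigmoid so its contribution
    stays bounded even for large feedback errors; the integral term
    accumulates the error history. Gains and bounds are illustrative
    assumptions, not the paper's reported values.
    """

    def __init__(self, kl_setpoint, kp=0.01, ki=0.0001,
                 beta_min=0.0, beta_max=1.0):
        self.kl_setpoint = kl_setpoint  # desired KL-divergence (set point)
        self.kp = kp                    # proportional gain (assumed value)
        self.ki = ki                    # integral gain (assumed value)
        self.beta_min = beta_min
        self.beta_max = beta_max
        self.integral = 0.0             # accumulated feedback error

    def update(self, kl_actual):
        # Feedback error: how far the observed KL is from the set point.
        error = self.kl_setpoint - kl_actual
        # Non-linear P-term: a sigmoid keeps this contribution in (0, kp).
        p_term = self.kp / (1.0 + math.exp(error))
        # I-term: integrate the error over training steps.
        self.integral += error
        beta = p_term - self.ki * self.integral
        # Clamp beta to a sensible range.
        return min(self.beta_max, max(self.beta_min, beta))
```

In a training loop, one would compute the batch KL-divergence each step, call `update(kl)` to obtain the current beta, and weight the KL term in the loss as `reconstruction_loss + beta * kl`. When the observed KL exceeds the set point, the controller raises beta to push the KL back down; when it falls below, beta shrinks, relaxing the constraint.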