Benign Autoencoders

Paper Title

Benign Autoencoders

Authors

Semyon Malamud, Teng Andrea Xu, Antoine Didisheim

Abstract

Recent progress in Generative Artificial Intelligence (AI) relies on efficient data representations, often featuring encoder-decoder architectures. We formalize the mathematical problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE). We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem. We highlight surprising connections between BAE and several recent developments in AI, such as conditional GANs, context encoders, stable diffusion, stacked autoencoders, and the learning capabilities of generative models. As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift. By compressing "malignant" data dimensions, BAE leads to smoother and more stable gradients.
