Paper title
Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent
Paper authors
Paper abstract
Expectation maximization (EM) is the default algorithm for fitting probabilistic models with missing or latent variables, yet we lack a full understanding of its non-asymptotic convergence properties. Previous works show results along the lines of "EM converges at least as fast as gradient descent" by assuming the conditions for the convergence of gradient descent apply to EM. This approach is not only loose, in that it does not capture that EM can make more progress than a gradient step, but the assumptions fail to hold for textbook examples of EM like Gaussian mixtures. In this work we first show that for the common setting of exponential family distributions, viewing EM as a mirror descent algorithm leads to convergence rates in Kullback-Leibler (KL) divergence. Then, we show how the KL divergence is related to first-order stationarity via Bregman divergences. In contrast to previous works, the analysis is invariant to the choice of parametrization and holds with minimal assumptions. We also show applications of these ideas to local linear (and superlinear) convergence rates, generalized EM, and non-exponential family distributions.
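As a rough illustration of the mirror-descent view summarized above (a sketch of the standard definitions in generic notation, not the paper's precise statements): a mirror descent step on an objective $f$ with mirror map $h$ is
$$\theta_{t+1} = \arg\min_{\theta}\; \langle \nabla f(\theta_t), \theta \rangle + \tfrac{1}{\eta}\, D_h(\theta, \theta_t), \qquad D_h(\theta, \theta') = h(\theta) - h(\theta') - \langle \nabla h(\theta'),\, \theta - \theta' \rangle,$$
where $D_h$ is the Bregman divergence generated by $h$. For an exponential family $p_\theta(x) = \exp\big(\langle \theta, s(x)\rangle - A(\theta)\big)\,\nu(x)$, the Bregman divergence of the log-partition function $A$ is itself a KL divergence, $D_A(\theta, \theta') = \mathrm{KL}(p_{\theta'}\,\|\,p_\theta)$, which is why rates stated in terms of $D_A$ translate into parametrization-invariant rates in KL divergence.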