Paper Title
Sum-of-Squares Relaxations for Information Theory and Variational Inference
Paper Authors
Paper Abstract
We consider extensions of the Shannon relative entropy, referred to as $f$-divergences. Three classical computational problems are typically associated with these divergences: (a) estimation from moments, (b) computing normalizing integrals, and (c) variational inference in probabilistic models. These problems are related to one another through convex duality, and all of them have many applications throughout data science; we aim for computationally tractable approximation algorithms that preserve properties of the original problem, such as potential convexity or monotonicity. In order to achieve this, we derive a sequence of convex relaxations for computing these divergences from non-centered covariance matrices associated with a given feature vector: starting from the typically intractable optimal lower bound, we consider an additional relaxation based on ``sums-of-squares'', which is now computable in polynomial time as a semidefinite program. We also provide computationally more efficient relaxations based on spectral information divergences from quantum information theory. For all of the tasks above, beyond proposing new relaxations, we derive tractable convex optimization algorithms, and we present illustrations on multivariate trigonometric polynomials and functions on the Boolean hypercube.
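To make the abstract's ingredients concrete, the sketch below is a minimal, hypothetical illustration (not the paper's algorithm): it builds non-centered covariance matrices $\mathbb{E}[\varphi(x)\varphi(x)^\top]$ from a trigonometric feature vector and evaluates a spectral information divergence from quantum information theory, here the von Neumann relative entropy $\mathrm{tr}\,(A(\log A - \log B))$. The feature map, sample distributions, and the exact divergence formula are assumptions for illustration and may differ from the quantities used in the paper.

```python
# Hypothetical sketch: non-centered covariance matrices from a feature vector
# and a spectral (von Neumann) relative entropy between them. Illustrative only;
# the paper's relaxations and divergence definitions may differ.
import numpy as np
from scipy.linalg import logm


def noncentered_covariance(samples, feature_map):
    """Empirical non-centered covariance E[phi(x) phi(x)^T] of a feature vector."""
    feats = np.array([feature_map(x) for x in samples])
    return feats.T @ feats / len(samples)


def von_neumann_relative_entropy(A, B, eps=1e-10):
    """tr(A (log A - log B)) for symmetric PSD matrices A, B (regularized for stability)."""
    d = A.shape[0]
    A = A + eps * np.eye(d)
    B = B + eps * np.eye(d)
    return float(np.trace(A @ (logm(A) - logm(B))).real)


# Toy usage: trigonometric feature map on the circle, two sample distributions
# (both the distributions and the feature map are arbitrary choices for this demo).
rng = np.random.default_rng(0)
phi = lambda x: np.array([1.0, np.cos(x), np.sin(x), np.cos(2 * x), np.sin(2 * x)])
xs_p = rng.normal(loc=0.0, scale=0.5, size=2000)  # samples from p
xs_q = rng.normal(loc=1.0, scale=0.8, size=2000)  # samples from q
A = noncentered_covariance(xs_p, phi)
B = noncentered_covariance(xs_q, phi)
print(von_neumann_relative_entropy(A, B))
```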