论文标题
EM算法的公平婚姻原理和初始化图
Fair Marriage Principle and Initialization Map for the EM Algorithm
论文作者
论文摘要
EM算法的流行融合理论解释说,观察到的不完整的数据对数类样l和完整的数据日志样式q呈正相关,我们可以通过最大化Q来最大化l。确定性退火EM(DAEM)算法是为了避免局部最大程度的理论。这是一个不合适的理论。 2)局部最大Q可以影响收敛速度,但不能阻止全局收敛; 3)像婚姻竞争一样,两个组成部分之间的不公平竞争可能会大大降低全球收敛速度; 4)存在本地融合,因为样本太小,并且存在不公平的竞争; 5)改进的EM算法,称为通道匹配(CM)EM算法,可以加速全局收敛。本文提供了一个初始化图,其中有两个均值作为两个轴,用于由Daem算法的作者研究的二进制高斯混合物的示例。该地图可以说明不同初始均值的收敛速度的速度以及某些区域中的点不适合初始点。二维示例表明,大样本或公平初始化可以避免全局收敛。对于更复杂的混合模型,我们需要进一步研究将公平的婚姻原则转换为初始化的特定方法。
The popular convergence theory of the EM algorithm explains that the observed incomplete data log-likelihood L and the complete data log-likelihood Q are positively correlated, and we can maximize L by maximizing Q. The Deterministic Annealing EM (DAEM) algorithm was hence proposed for avoiding locally maximal Q. This paper provides different conclusions: 1) The popular convergence theory is wrong; 2) The locally maximal Q can affect the convergent speed, but cannot block the global convergence; 3) Like marriage competition, unfair competition between two components may vastly decrease the globally convergent speed; 4) Local convergence exists because the sample is too small, and unfair competition exists; 5) An improved EM algorithm, called the Channel Matching (CM) EM algorithm, can accelerate the global convergence. This paper provides an initialization map with two means as two axes for the example of a binary Gaussian mixture studied by the authors of DAEM algorithm. This map can tell how fast the convergent speeds are for different initial means and why points in some areas are not suitable as initial points. A two-dimensional example indicates that the big sample or the fair initialization can avoid global convergence. For more complicated mixture models, we need further study to convert the fair marriage principle to specific methods for the initializations.