Paper Title
Improving Gradient Flow with Unrolled Highway Expectation Maximization
Paper Abstract
Integrating model-based machine learning methods into deep neural architectures allows one to leverage both the expressive power of deep neural nets and the ability of model-based methods to incorporate domain-specific knowledge. In particular, many works have employed the expectation maximization (EM) algorithm in the form of an unrolled layer-wise structure that is jointly trained with a backbone neural network. However, it is difficult to discriminatively train the backbone network by backpropagating through the EM iterations, as they are prone to the vanishing gradient problem. To address this issue, we propose Highway Expectation Maximization Networks (HEMNet), which consist of unrolled iterations of the generalized EM (GEM) algorithm based on the Newton-Raphson method. HEMNet features scaled skip connections, or highways, along the depth of the unrolled architecture, improving gradient flow during backpropagation while incurring negligible additional computation and memory costs compared to standard unrolled EM. Furthermore, HEMNet preserves the underlying EM procedure, thereby fully retaining the convergence properties of the original EM algorithm. We achieve significant performance improvements on several semantic segmentation benchmarks and empirically show that HEMNet effectively alleviates gradient decay.
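
As a rough illustration of the mechanism the abstract describes, the following is a minimal PyTorch sketch of unrolled GEM iterations with a scaled skip connection on Gaussian mixture means. It is a sketch under stated assumptions, not the authors' implementation: it assumes unit-variance components, and the function name, tensor shapes, and step size eta are hypothetical choices for illustration.

    import torch

    def unrolled_highway_em(x, mu, n_iters=3, eta=0.5):
        """Sketch: one forward pass of unrolled GEM with highway connections.

        x:  (N, D) tensor of observed features (e.g., backbone activations).
        mu: (K, D) tensor of initial Gaussian component means.
        """
        gamma = None
        for _ in range(n_iters):
            # E-step: responsibilities under unit-variance Gaussian components.
            logits = -0.5 * torch.cdist(x, mu) ** 2          # (N, K)
            gamma = torch.softmax(logits, dim=1)
            # Standard EM M-step target: responsibility-weighted means.
            mu_em = (gamma.t() @ x) / gamma.sum(dim=0).unsqueeze(1)  # (K, D)
            # Highway (scaled skip) update: a single Newton-Raphson step on the
            # GEM objective reduces to a convex combination of the old and new
            # means, giving the backward pass a scaled identity path through mu.
            mu = (1.0 - eta) * mu + eta * mu_em
        return mu, gamma

    # Hypothetical usage: 1024 feature vectors of dimension 64, 8 components.
    mu_final, gamma = unrolled_highway_em(torch.randn(1024, 64), torch.randn(8, 64))

Setting eta = 1 recovers the standard unrolled EM update, while eta in (0, 1) retains a scaled skip path from the means at one iteration to the next, which is the property that counteracts gradient decay through the unrolled depth.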