Paper Title

Decentralized Deep Learning using Momentum-Accelerated Consensus

Paper Authors

Aditya Balu, Zhanhong Jiang, Sin Yong Tan, Chinmay Hegde, Young M. Lee, Soumik Sarkar

Abstract

We consider the problem of decentralized deep learning where multiple agents collaborate to learn from a distributed dataset. While there exist several decentralized deep learning approaches, the majority consider a central parameter-server topology for aggregating the model parameters from the agents. However, such a topology may be inapplicable in networked systems such as ad-hoc mobile networks, field robotics, and power network systems where direct communication with the central parameter server may be inefficient. In this context, we propose and analyze a novel decentralized deep learning algorithm where the agents interact over a fixed communication topology (without a central server). Our algorithm is based on the heavy-ball acceleration method used in gradient-based optimization. We propose a novel consensus protocol where each agent shares with its neighbors its model parameters as well as gradient-momentum values during the optimization process. We consider both strongly convex and non-convex objective functions and theoretically analyze our algorithm's performance. We present several empirical comparisons with competing decentralized learning methods to demonstrate the efficacy of our approach under different communication topologies.
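To make the consensus protocol described above concrete, below is a minimal numerical sketch, not the paper's exact update rule: a few agents on a fixed ring topology each mix both their model parameters and their heavy-ball momentum with their neighbors through a doubly stochastic matrix, then take a local gradient step on a simple strongly convex quadratic. The names W, alpha, beta, and targets, as well as the specific ordering of the mixing and momentum steps, are illustrative assumptions.

```python
# A minimal sketch (not the paper's exact algorithm) of momentum-accelerated
# consensus: agents on a fixed ring topology mix both their parameters and
# their heavy-ball momentum with neighbors via a doubly stochastic matrix W,
# then take a local gradient step.
import numpy as np

n_agents, dim = 5, 3
rng = np.random.default_rng(0)

# Local strongly convex objectives f_i(x) = 0.5 * ||x - c_i||^2,
# so the gradient at x_i is simply x_i - c_i.
targets = rng.normal(size=(n_agents, dim))

# Doubly stochastic mixing matrix for a ring: each agent averages itself
# and its two neighbors with weight 1/3 each.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 1.0 / 3.0

x = np.zeros((n_agents, dim))   # one parameter vector per agent
m = np.zeros((n_agents, dim))   # one momentum vector per agent
alpha, beta = 0.1, 0.5          # step size and momentum coefficient (illustrative)

for _ in range(200):
    grads = x - targets              # local gradients at the current parameters
    x_mixed, m_mixed = W @ x, W @ m  # exchange parameters and momentum with neighbors
    m = beta * m_mixed + grads       # heavy-ball momentum update after mixing
    x = x_mixed - alpha * m          # parameter update from the mixed iterate

# The network average of the parameters converges to the minimizer of the
# average objective (the mean of the local targets); individual agents end
# up in a neighborhood of it whose size shrinks with the step size.
print(np.allclose(x.mean(axis=0), targets.mean(axis=0), atol=1e-6))
print("max per-agent deviation:", np.abs(x - targets.mean(axis=0)).max())
```

In this toy setting the averaged iterate follows a standard heavy-ball recursion on the average objective, which is why the network mean reaches the centralized minimizer while individual agents agree only up to a step-size-dependent neighborhood.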
