Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Paper Title

Throughput-Optimal Topology Design for Cross-Silo Federated Learning

Authors

Othmane Marfoq, Chuan Xu, Giovanni Neglia, Richard Vidal

Abstract

Federated learning usually employs a client-server architecture in which an orchestrator iteratively aggregates model updates from remote clients and pushes a refined model back to them. This approach may be inefficient in cross-silo settings, as close-by data silos with high-speed access links may exchange information faster than with the orchestrator, and the orchestrator may become a communication bottleneck. In this paper, we define the problem of topology design for cross-silo federated learning, using the theory of max-plus linear systems to compute the system throughput, that is, the number of communication rounds per time unit. We also propose practical algorithms that, given knowledge of measurable network characteristics, find a topology with the largest throughput or with provable throughput guarantees. In realistic Internet networks with 10 Gbps access links for silos, our algorithms speed up training by factors of 9 and 1.5 in comparison to the master-slave architecture and to state-of-the-art MATCHA, respectively. Speedups are even larger with slower access links.
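
To make the throughput notion concrete, here is a minimal sketch. It is not the authors' algorithm. It relies on a standard max-plus result: for a strongly connected communication graph, the asymptotic time per round (the cycle time) equals the maximum cycle mean of the edge delays, and the throughput is its inverse. The function names, the edge-list representation, and the delay values are assumptions made for illustration; the abstract does not specify the delay model. The maximum cycle mean is computed with Karp's algorithm.

```python
from math import inf


def max_cycle_mean(n, edges):
    """Maximum cycle mean of a directed graph (Karp's algorithm).

    n     -- number of nodes, labelled 0..n-1 (hypothetical silos)
    edges -- list of (u, v, delay) tuples; delays are illustrative
    Returns -inf if the graph has no cycle.
    """
    # D[k][v] = largest total delay of a walk with exactly k edges ending at v.
    D = [[-inf] * n for _ in range(n + 1)]
    D[0] = [0.0] * n  # a zero-edge walk may start at any node
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] > -inf:
                D[k][v] = max(D[k][v], D[k - 1][u] + w)

    best = -inf
    for v in range(n):
        if D[n][v] == -inf:
            continue
        # Karp's formula: minimise over walk lengths k, maximise over nodes v.
        worst = min(
            (D[n][v] - D[k][v]) / (n - k)
            for k in range(n)
            if D[k][v] > -inf
        )
        best = max(best, worst)
    return best


def throughput(n, edges):
    """Communication rounds per time unit, assuming cycle time = max cycle mean."""
    return 1.0 / max_cycle_mean(n, edges)


# Directed ring of three silos with made-up delays 1, 2 and 3 time units.
ring = [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 3.0)]
# The same ring plus a slow reverse link between silos 1 and 0.
ring_plus_slow_link = ring + [(1, 0, 5.0)]

print(throughput(3, ring))
print(throughput(3, ring_plus_slow_link))
```

In this toy example the extra slow link creates a two-node cycle with a larger mean delay than the ring, so adding it lowers the throughput. This illustrates why the choice of topology matters, under the assumptions stated above.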
