Title

A Low-latency Communication Design for Brain Simulations

Author

Du, Xin

Abstract

Brain simulation, as one of the latest advances in artificial intelligence, facilitates a better understanding of how information is represented and processed in the brain. The extreme complexity of the human brain makes brain simulations feasible only on high-performance computing platforms. Supercomputers with large numbers of interconnected graphics processing units (GPUs) are currently employed to support brain simulations. High-throughput, low-latency inter-GPU communication in such supercomputers therefore plays a crucial role in meeting the performance requirements of brain simulation as a highly time-sensitive application. In this paper, we first provide an overview of current parallelization technologies for brain simulation on multi-GPU architectures. We then analyze the communication challenges posed by brain simulation and summarize design guidelines for addressing them. Furthermore, we propose a partitioning algorithm and a two-level routing method to achieve efficient, low-latency communication in multi-GPU architectures for brain simulation. We report experimental results obtained on a supercomputer with 2,000 GPUs simulating a brain model with 10 billion neurons, showing that our approach can significantly improve communication performance. Finally, we discuss open issues and identify several research directions for low-latency communication design for brain simulations.
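
The abstract names a two-level routing method but does not describe it. As a rough illustration only, and not the authors' algorithm, the Python sketch below shows one common two-level scheme for inter-GPU spike traffic: spikes between GPUs on the same node are delivered directly, while spikes bound for remote nodes are aggregated through a per-node gateway GPU to reduce the number of small inter-node messages. All identifiers (route_spikes, gpu_to_node, gateway_of_node) are hypothetical.

# Illustrative sketch of two-level (node-level, then GPU-level) spike routing.
# Not taken from the paper; names and structure are assumptions.
from collections import defaultdict

def route_spikes(spikes, gpu_to_node, gateway_of_node):
    """Split outgoing spikes into intra-node and inter-node batches.

    spikes          : list of (src_gpu, dst_gpu, payload)
    gpu_to_node     : dict mapping a GPU id to its node id
    gateway_of_node : dict mapping a node id to its gateway GPU id
    """
    intra = defaultdict(list)   # (src_gpu, dst_gpu) -> payloads, same node
    inter = defaultdict(list)   # (src_gateway, dst_gateway) -> payloads

    for src, dst, payload in spikes:
        if gpu_to_node[src] == gpu_to_node[dst]:
            # First level: direct delivery within a node.
            intra[(src, dst)].append(payload)
        else:
            # Second level: batch through the gateways of the two nodes;
            # the destination gateway forwards to dst inside its node.
            g_src = gateway_of_node[gpu_to_node[src]]
            g_dst = gateway_of_node[gpu_to_node[dst]]
            inter[(g_src, g_dst)].append((dst, payload))
    return intra, inter

if __name__ == "__main__":
    gpu_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
    gateway_of_node = {0: 0, 1: 2}
    spikes = [(0, 1, "s1"), (1, 3, "s2"), (2, 0, "s3")]
    intra, inter = route_spikes(spikes, gpu_to_node, gateway_of_node)
    print(dict(intra))  # {(0, 1): ['s1']}
    print(dict(inter))  # {(0, 2): [(3, 's2')], (2, 0): [(0, 's3')]}

The intended trade-off in such a scheme is an extra intra-node hop in exchange for far fewer small inter-node messages, which is typically the dominant latency cost in multi-GPU clusters.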
