Paper Title
Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations
Paper Authors
Paper Abstract
Recent progress in Graph Neural Networks (GNNs) for modeling atomic simulations has the potential to revolutionize catalyst discovery, which is a key step in making progress towards the energy breakthroughs needed to combat climate change. However, the GNNs that have proven most effective for this task are memory intensive as they model higher-order interactions in the graphs such as those between triplets or quadruplets of atoms, making it challenging to scale these models. In this paper, we introduce Graph Parallelism, a method to distribute input graphs across multiple GPUs, enabling us to train very large GNNs with hundreds of millions or billions of parameters. We empirically evaluate our method by scaling up the number of parameters of the recently proposed DimeNet++ and GemNet models by over an order of magnitude. On the large-scale Open Catalyst 2020 (OC20) dataset, these graph-parallelized models lead to relative improvements of 1) 15% on the force MAE metric for the S2EF task and 2) 21% on the AFbT metric for the IS2RS task, establishing new state-of-the-art results.
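To make the idea of graph parallelism concrete, here is a minimal, hypothetical PyTorch sketch of the pattern the abstract describes: the edge list of a single large input graph is sharded across GPUs, each rank computes messages only for its local edge shard, and an all-reduce sums the partial node aggregations. This is an illustration under assumed details, not the paper's actual implementation; names such as `partition_edges`, `graph_parallel_layer`, and `message_mlp` are invented for the example.

```python
import torch
import torch.distributed as dist


def partition_edges(edge_index: torch.Tensor, world_size: int, rank: int) -> torch.Tensor:
    """Assign a contiguous slice of the graph's edge list to this rank."""
    num_edges = edge_index.size(1)
    per_rank = (num_edges + world_size - 1) // world_size
    start = rank * per_rank
    end = min(start + per_rank, num_edges)
    return edge_index[:, start:end]


def graph_parallel_layer(x: torch.Tensor,
                         local_edge_index: torch.Tensor,
                         message_mlp: torch.nn.Module) -> torch.Tensor:
    """One message-passing layer where each rank owns only an edge shard.

    x:                [num_nodes, hidden] node features, replicated on every rank.
    local_edge_index: [2, num_local_edges] this rank's shard of the edges.
    message_mlp:      maps concatenated (src, dst) features back to `hidden`.
    """
    src, dst = local_edge_index
    # Compute messages only for the local edge shard. This is where the memory
    # savings come from: the expensive higher-order (triplet/quadruplet) terms
    # scale with the number of edges, which is now split across GPUs.
    messages = message_mlp(torch.cat([x[src], x[dst]], dim=-1))
    # Scatter-sum local messages into a partial per-node aggregation.
    agg = torch.zeros_like(x)
    agg.index_add_(0, dst, messages)
    # Sum the partial aggregations across ranks so every GPU ends up with the
    # full neighborhood sum, as if the whole graph fit on one device.
    dist.all_reduce(agg, op=dist.ReduceOp.SUM)
    return agg
```

In this sketch only the edge-level work is distributed; node features stay replicated, so activation memory for edge messages drops roughly linearly with the number of GPUs at the cost of one all-reduce per layer.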