Paper title
FINETUNA: Fine-tuning Accelerated Molecular Simulations
Authors
Abstract
Machine learning approaches have the potential to approximate Density Functional Theory (DFT) for atomistic simulations in a computationally efficient manner, which could dramatically increase the impact of computational simulations on real-world problems. However, they are limited by their accuracy and by the cost of generating labeled data. Here, we present an online active learning framework that accelerates the simulation of atomic systems efficiently and accurately by incorporating prior physical information learned by large-scale pre-trained graph neural network models from the Open Catalyst Project. Accelerating these simulations enables useful data to be generated more cheaply, allowing better models to be trained and more atomistic systems to be screened. We also present a method of comparing local optimization techniques on the basis of both their speed and accuracy. Experiments on 30 benchmark adsorbate-catalyst systems show that our transfer learning method, which incorporates prior information from pre-trained models, accelerates simulations by reducing the number of DFT calculations by 91%, while meeting an accuracy threshold of 0.02 eV 93% of the time. Finally, we demonstrate a technique for leveraging the interactive functionality built into VASP to efficiently compute single-point calculations within our online active learning framework without significant startup costs. This allows VASP to work in tandem with our framework while requiring 75% fewer self-consistent cycles than conventional single-point calculations. The online active learning implementation and examples using the VASP interactive code are available in the open source FINETUNA package on GitHub.
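To make the idea of online active learning concrete, the following is a minimal toy sketch and not the FINETUNA implementation or its API. It assumes a one-dimensional harmonic well as a stand-in for the expensive DFT call, and a linear force model with a deliberately imperfect prior as a stand-in for the pre-trained graph neural network. All names (`parent_force`, `LinearSurrogate`, `relax`) and the query rule (call the parent only when the surrogate believes the structure is relaxed) are hypothetical choices made for illustration.

```python
import numpy as np


def parent_force(x, k=2.0, x0=1.5):
    """Toy stand-in for an expensive DFT single-point call (harmonic well)."""
    return -k * (x - x0)


class LinearSurrogate:
    """Toy 'pre-trained' force model F(x) = a*x + b, fine-tuned on parent data."""

    def __init__(self, a=-1.0, b=1.0):
        # Deliberately imperfect prior: it places the minimum at x = 1.0.
        self.a, self.b = a, b
        self.xs, self.fs = [], []

    def predict(self, x):
        return self.a * x + self.b

    def fine_tune(self, x, f):
        """Store one parent-labeled point and refit the model."""
        self.xs.append(x)
        self.fs.append(f)
        if len(self.xs) >= 2:
            # Least-squares line through all parent data collected so far.
            self.a, self.b = np.polyfit(self.xs, self.fs, 1)
        else:
            # With a single point, keep the prior slope and shift the intercept.
            self.b = f - self.a * x


def relax(x, fmax=0.01, step=0.2, max_steps=500):
    """Steepest-descent relaxation driven by the surrogate.

    The parent is queried only when the surrogate predicts convergence;
    if the parent disagrees, its label is used to fine-tune the surrogate.
    Returns the final position and the number of parent calls.
    """
    model = LinearSurrogate()
    parent_calls = 0
    for _ in range(max_steps):
        f_ml = model.predict(x)
        if abs(f_ml) < fmax:
            f_true = parent_force(x)  # verify with the expensive call
            parent_calls += 1
            if abs(f_true) < fmax:
                return x, parent_calls
            model.fine_tune(x, f_true)
            f_ml = model.predict(x)
        x = x + step * f_ml
    raise RuntimeError("relaxation did not converge")


x_final, n_calls = relax(0.0)
print(f"relaxed position: {x_final:.4f}, parent calls: {n_calls}")
```

In this sketch most optimizer steps are taken with the cheap surrogate, and the expensive call is reserved for verification, which is the general mechanism by which such a loop reduces the number of parent calculations. A real workflow would differ in the model, the optimizer, and the criterion for querying the parent.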