Paper title
Simple and efficient algorithms for training machine learning potentials to force data
Authors
Abstract
Machine learning models, trained on data from ab initio quantum simulations, are yielding molecular dynamics potentials with unprecedented accuracy. One limiting factor is the quantity of available training data, which can be expensive to obtain. A quantum simulation often provides all atomic forces, in addition to the total energy of the system. These forces provide much more information than the energy alone. It may appear that training a model to this large quantity of force data would introduce significant computational costs. Actually, training to all available force data should only be a few times more expensive than training to energies alone. Here, we present a new algorithm for efficient force training, and benchmark its accuracy by training to forces from real-world datasets for organic chemistry and bulk aluminum.
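The cost claim can be made concrete with a small sketch. It is not taken from the paper: the toy 1D pair potential, the function names, and the step size `eps` are all assumptions chosen for illustration. Forces are the negative position-gradient of the energy, so a squared-error force loss L = Σ_i (F_i − F_i^ref)² has the parameter gradient ∂L/∂θ = −2 Σ_i v_i ∂²E/∂x_i∂θ with v = F − F^ref. This is a directional derivative of ∂E/∂θ along v in position space, and one simple way to estimate it is a central difference of two ordinary energy-gradient evaluations.

```python
# Toy model (hypothetical, for illustration only):
#   E(x; a, b) = sum_{i<j} [a*d^2 + b*d^4],  d = x_i - x_j,  atoms on a line.

def pairs(n):
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def energy(x, params):
    a, b = params
    return sum(a * (x[i] - x[j]) ** 2 + b * (x[i] - x[j]) ** 4
               for i, j in pairs(len(x)))

def forces(x, params):
    """Analytic forces F_i = -dE/dx_i."""
    a, b = params
    f = [0.0] * len(x)
    for i, j in pairs(len(x)):
        d = x[i] - x[j]
        g = 2 * a * d + 4 * b * d ** 3  # dE/dd for this pair
        f[i] -= g
        f[j] += g
    return f

def energy_param_grad(x, params):
    """dE/dtheta: the quantity energy-only training already needs."""
    ps = pairs(len(x))
    return [sum((x[i] - x[j]) ** 2 for i, j in ps),
            sum((x[i] - x[j]) ** 4 for i, j in ps)]

def force_loss(x, params, f_ref):
    return sum((fi - ri) ** 2 for fi, ri in zip(forces(x, params), f_ref))

def force_loss_param_grad(x, params, f_ref, eps=1e-4):
    """Estimate dL/dtheta from two energy-gradient evaluations.

    v is the force residual, held fixed while differentiating;
    dL/dtheta = -2 * (directional derivative of dE/dtheta along v).
    """
    v = [fi - ri for fi, ri in zip(forces(x, params), f_ref)]
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = energy_param_grad(x_plus, params)
    g_minus = energy_param_grad(x_minus, params)
    return [-2.0 * (p - m) / (2.0 * eps) for p, m in zip(g_plus, g_minus)]

if __name__ == "__main__":
    x = [0.0, 0.7, 1.5]
    f_ref = forces(x, [1.2, 0.25])  # synthetic "reference" forces
    print(force_loss_param_grad(x, [1.0, 0.3], f_ref))
```

In this sketch one force-loss gradient costs one force evaluation plus two energy-gradient evaluations, regardless of the number of atoms, which is consistent with the "few times more expensive" estimate above. A production implementation would more likely rely on automatic differentiation, and the finite-difference step would need tuning against round-off.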