Paper Title
Generalizability of Functional Forms for Interatomic Potential Models Discovered by Symbolic Regression
Paper Authors
Paper Abstract
In recent years there has been great progress in the use of machine learning algorithms to develop interatomic potential models. Machine-learned potential models are typically orders of magnitude faster than density functional theory but also orders of magnitude slower than physics-derived models such as the embedded atom method. In our previous work, we used symbolic regression to develop fast, accurate, and transferable interatomic potential models for copper with novel functional forms that resemble those of the embedded atom method. To determine the extent to which the success of these forms is specific to copper, here we explore the generalizability of these models to other face-centered cubic transition metals and analyze their out-of-sample performance on several material properties. We find that these forms work particularly well on elements that are chemically similar to copper. When compared to optimized Sutton-Chen models, which have similar complexity, the functional forms discovered using symbolic regression perform better across all elements considered except gold, where the two perform similarly. They perform similarly to a moderately more complex embedded atom form on the properties on which they were trained, and they are more accurate on average on other properties. We attribute this improved generalized accuracy to the relative simplicity of the models discovered using symbolic regression. The genetic programming models are found to outperform other models from the literature about 50% of the time in a variety of property predictions, with about one-tenth the model complexity on average. We discuss the implications of these results for the broader application of symbolic regression to the development of new potentials and highlight how models discovered for one element can be used to seed new searches for different elements.
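For readers unfamiliar with the Sutton-Chen baseline named in the abstract, the sketch below evaluates its standard functional form, E = ε Σ_i [ ½ Σ_{j≠i} (a/r_ij)^n − c √ρ_i ] with ρ_i = Σ_{j≠i} (a/r_ij)^m, for a small finite cluster. It is an illustration only, not the authors' code and not one of the forms discovered by symbolic regression. The function name `sutton_chen_energy`, the absence of a cutoff and of periodic images, and the parameter values in the usage lines are all assumptions chosen for simplicity, not fitted values for any element.

```python
import math


def sutton_chen_energy(positions, epsilon, a, c, n, m):
    """Total energy of a finite cluster under the Sutton-Chen form.

    Assumptions (illustrative only): no cutoff radius, no periodic images,
    and positions given as a list of 3D coordinate tuples.
    """
    total = 0.0
    for i, ri in enumerate(positions):
        pair_sum = 0.0   # repulsive pair term: sum over j of (a / r_ij)^n
        density = 0.0    # density-like term rho_i: sum over j of (a / r_ij)^m
        for j, rj in enumerate(positions):
            if i == j:
                continue
            r = math.dist(ri, rj)
            pair_sum += (a / r) ** n
            density += (a / r) ** m
        # The embedding term -c * sqrt(rho_i) supplies the many-body cohesion.
        total += 0.5 * pair_sum - c * math.sqrt(density)
    return epsilon * total


# Placeholder parameters (not fitted to any real element): a dimer at r = a.
dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
energy = sutton_chen_energy(dimer, epsilon=1.0, a=1.0, c=2.0, n=9, m=6)
```

With the dimer separation equal to `a`, every power of `a / r` is 1, so each atom contributes 0.5 − c and the total is ε(1 − 2c). Both the pair term and the embedding term are fixed analytic expressions in this form, which is why the abstract treats it as a baseline of similar complexity to the symbolic-regression models.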