Paper title
Committee neural network potentials control generalization errors and enable active learning
Paper authors
Paper abstract
It is well known in the field of machine learning that committee models improve accuracy, provide generalization error estimates, and enable active learning strategies. In this work, we adapt these concepts to interatomic potentials based on artificial neural networks. Instead of a single model, multiple models that share the same atomic environment descriptors yield an average that outperforms its individual members as well as a measure of the generalization error in the form of the committee disagreement. We not only use this disagreement to identify the most relevant configurations to build up the model's training set in an active learning procedure, but also monitor and bias it during simulations to control the generalization error. This facilitates the adaptive development of committee neural network potentials and their training sets, while keeping the number of ab initio calculations to a minimum. To illustrate the benefits of this methodology, we apply it to the development of a committee model for water in the condensed phase. Starting from a single reference ab initio simulation, we use active learning to expand into new state points and to describe the quantum nature of the nuclei. The final model, trained on 814 reference calculations, yields excellent results under a range of conditions, from liquid water at ambient and elevated temperatures and pressures to different phases of ice, and the air-water interface, all including nuclear quantum effects. This approach to committee models will enable the systematic development of robust machine learning models for a broad range of systems.
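The committee idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy one-dimensional "potentials", the helper names `committee_predict` and `select_for_labeling`, and the choice of the population standard deviation as the disagreement measure are all assumptions made here for illustration. The sketch averages the member predictions, measures how much the members disagree, and picks the candidate configurations with the largest disagreement as the ones that would be sent for new reference calculations in an active learning step.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained potential: maps a configuration
# (here just one float) to an energy.
Model = Callable[[float], float]


def committee_predict(models: Sequence[Model], x: float) -> Tuple[float, float]:
    """Return (committee average, committee disagreement) for one configuration."""
    predictions = [model(x) for model in models]
    # Assumption: disagreement is the standard deviation over committee members.
    return mean(predictions), pstdev(predictions)


def select_for_labeling(
    models: Sequence[Model], candidates: Sequence[float], n_select: int
) -> List[float]:
    """Pick the candidates on which the committee disagrees the most."""
    ranked = sorted(
        candidates,
        key=lambda x: committee_predict(models, x)[1],
        reverse=True,
    )
    return ranked[:n_select]


# Toy committee: three members that agree near x = 0 and drift apart away from it.
toy_models: List[Model] = [
    lambda x: x**2,
    lambda x: x**2 + 0.1 * x**3,
    lambda x: x**2 - 0.1 * x**3,
]

candidates = [0.0, 0.5, 1.0, 2.0]
energy, disagreement = committee_predict(toy_models, 2.0)
chosen = select_for_labeling(toy_models, candidates, n_select=2)
print(f"committee energy at x=2.0: {energy:.3f}, disagreement: {disagreement:.3f}")
print("configurations selected for new reference calculations:", chosen)
```

In this toy setting the disagreement grows with distance from the region where the members agree, so the selection favors the configurations the committee knows least about. A real workflow would replace the lambdas with trained neural network potentials and the floats with atomic configurations.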