Quantifying the performance of machine learning models in materials discovery

Paper title

Quantifying the performance of machine learning models in materials discovery

Authors

Christopher K. H. Borg, Eric S. Muckley, Clara Nyby, James E. Saal, Logan Ward, Apurva Mehta, Bryce Meredig

Abstract

The predictive capabilities of machine learning (ML) models used in materials discovery are typically measured using simple statistics such as the root-mean-square error (RMSE) or the coefficient of determination ($r^2$) between ML-predicted materials property values and their known values. A tempting assumption is that models with low error should be effective at guiding materials discovery, and conversely, models with high error should give poor discovery performance. However, we observe that no clear connection exists between a "static" quantity averaged across an entire training set, such as RMSE, and an ML property model's ability to dynamically guide the iterative (and often extrapolative) discovery of novel materials with targeted properties. In this work, we simulate a sequential learning (SL)-guided materials discovery process and demonstrate a decoupling between traditional model error metrics and model performance in guiding materials discoveries. We show that model performance in materials discovery depends strongly on (1) the target range within the property distribution (e.g., whether a 1st or 10th decile material is desired); (2) the incorporation of uncertainty estimates in the SL acquisition function; (3) whether the scientist is interested in one discovery or many targets; and (4) how many SL iterations are allowed. To overcome the limitations of static metrics and robustly capture SL performance, we recommend metrics such as Discovery Yield ($DY$), a measure of how many high-performing materials were discovered during SL, and Discovery Probability ($DP$), a measure of likelihood of discovering high-performing materials at any point in the SL process.
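The simulated SL process and the two recommended metrics can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data, the linear least-squares model, the greedy "highest predicted value" acquisition function (which ignores uncertainty), the top-decile target definition, and the helper names `simulate_sl` and `discovery_metrics`. In particular, $DY$ is computed here as the fraction of target materials found by a given iteration, averaged over trials, and $DP$ as the fraction of trials that have found at least one target by that iteration; the paper's exact formulas may differ.

```python
import numpy as np

def simulate_sl(X, y, n_init=10, n_iter=20, top_frac=0.1, seed=0):
    """Run one simulated SL campaign (hypothetical helper).

    Returns the cumulative number of targets found after each iteration
    and the total number of targets in the dataset.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Assumption: "targets" are candidates in the top `top_frac` of y.
    is_target = y >= np.quantile(y, 1.0 - top_frac)
    # Assumption: start from non-target candidates only, so that
    # finding a target requires extrapolating beyond the training data.
    pool = np.flatnonzero(~is_target)
    known = list(rng.choice(pool, size=n_init, replace=False))
    hits = []
    for _ in range(n_iter):
        unknown = np.setdiff1d(np.arange(n), known)
        # Stand-in model: linear least squares with an intercept column.
        A = np.column_stack([X[known], np.ones(len(known))])
        coef, *_ = np.linalg.lstsq(A, y[known], rcond=None)
        pred = np.column_stack([X[unknown], np.ones(len(unknown))]) @ coef
        # Greedy acquisition: measure the candidate with the highest prediction.
        known.append(unknown[np.argmax(pred)])
        hits.append(int(is_target[known].sum()))
    return np.array(hits), int(is_target.sum())

def discovery_metrics(X, y, n_trials=20, **kwargs):
    """Average several SL campaigns into per-iteration DY and DP curves."""
    runs = []
    for trial in range(n_trials):
        hits, n_targets = simulate_sl(X, y, seed=trial, **kwargs)
        runs.append(hits)
    runs = np.vstack(runs)
    dy = runs.mean(axis=0) / n_targets  # fraction of targets found so far
    dp = (runs > 0).mean(axis=0)        # fraction of trials with >= 1 target
    return dy, dp

# Synthetic dataset: 200 candidates, 5 features, noisy linear property.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2]) + 0.1 * rng.normal(size=200)

dy, dp = discovery_metrics(X, y, n_trials=10, n_iter=15)
print("DY after final iteration:", dy[-1])
print("DP after final iteration:", dp[-1])
```

Because both curves are indexed by iteration, they show how the answer changes with the number of SL iterations allowed and with the choice of `top_frac`, two of the dependencies the abstract lists. A static error metric such as RMSE carries neither.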
