Paper title

Quantum chemical roots of machine-learning molecular similarity descriptors

Authors

Stefan Gugler, Markus Reiher

Abstract

In this work, we explore the quantum chemical foundations of descriptors for molecular similarity. Such descriptors are key for traversing chemical compound space with machine learning. Our focus is on the Coulomb matrix and on the smooth overlap of atomic positions (SOAP). We adopt a basic framework that allows us to connect both descriptors to electronic structure theory. This framework then enables us to define two new descriptors that are more closely related to electronic structure theory, which we call Coulomb lists and smooth overlap of electron densities (SOED). By investigating their usefulness as molecular similarity descriptors, we gain new insights into how and why the Coulomb matrix and SOAP work. Moreover, Coulomb lists avoid the somewhat mysterious diagonalization step of the Coulomb matrix and might provide a direct means to extract subsystem information that can be compared across Born-Oppenheimer surfaces of varying dimension. For the electron density, we derive the necessary formalism to create the SOED measure in close analogy to SOAP. Since this formalism is more involved than that of SOAP, we review the essential theory, but also introduce a set of approximations that eventually allow us to evaluate SOED with the same implementation that is available for SOAP. We focus our analysis on elementary reaction steps, where transition state structures are more similar to either reactant or product structures than the latter two are to one another. The prediction of electronic energies of transition state structures can, however, be more difficult than that of stable intermediates due to multi-configurational effects. The question arises to what extent molecular similarity descriptors rooted in electronic structure theory can resolve these intricate effects.
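
To make the diagonalization step mentioned in the abstract concrete, here is a minimal sketch of a Coulomb matrix for a diatomic molecule. It assumes the convention commonly used in the machine-learning literature (diagonal entries 0.5 Z^2.4, off-diagonal entries Z_i Z_j / |R_i - R_j|, atomic units); the abstract itself does not state the definition, and the function names and the geometry below are hypothetical. Coulomb lists and SOED are not implemented here, since their definitions are not given in the abstract.

```python
import math


def coulomb_matrix(charges, coords):
    """Coulomb matrix under the assumed common convention (atomic units)."""
    n = len(charges)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal: 0.5 * Z_i^2.4 (assumed convention, not from the abstract)
                m[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal: nuclear Coulomb repulsion Z_i * Z_j / R_ij
                m[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return m


def eigenvalues_2x2(m):
    """Closed-form eigenvalues of a symmetric 2x2 matrix, in ascending order."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return [mean - radius, mean + radius]


# Hypothetical diatomic (Z = 9 and Z = 1) at an illustrative distance of 1.7 bohr
cm = coulomb_matrix([9, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.7)])

# Same molecule with the atom order swapped: rows and columns are permuted
cm_swapped = coulomb_matrix([1, 9], [(0.0, 0.0, 1.7), (0.0, 0.0, 0.0)])

spectrum = eigenvalues_2x2(cm)
spectrum_swapped = eigenvalues_2x2(cm_swapped)

print("matrix entries differ under relabeling:", cm[0][0] != cm_swapped[0][0])
print("eigenvalues:", spectrum)
print("eigenvalues after swapping atoms:", spectrum_swapped)
```

The matrix entries depend on the arbitrary atom ordering, while the eigenvalue spectrum does not. This is the usual motivation for diagonalizing the Coulomb matrix, at the price of discarding the direct atom-pair interpretation of its entries.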
