Title
Physics-Guided Problem Decomposition for Scaling Deep Learning of High-dimensional Eigen-Solvers: The Case of Schrödinger's Equation
Authors
Abstract
Given their ability to effectively learn non-linear mappings and perform fast inference, deep neural networks (NNs) have been proposed as a viable alternative to traditional simulation-driven approaches for solving high-dimensional eigenvalue equations (HDEs), which are the foundation of many scientific applications. Unfortunately, for the learned models in these scientific applications to achieve generalization, a large, diverse, and preferably annotated dataset is typically needed and is computationally expensive to obtain. Furthermore, the learned models tend to be memory- and compute-intensive, primarily due to the size of the output layer. While generalization, especially extrapolation, with scarce data has been attempted by imposing physical constraints in the form of a physics loss, the problem of model scalability has remained. In this paper, we alleviate the compute bottleneck in the output layer by using physics knowledge to decompose the complex regression task of predicting the high-dimensional eigenvectors into multiple simpler sub-tasks, each of which is learned by a simple "expert" network. We call the resulting architecture of specialized experts Physics-Guided Mixture-of-Experts (PG-MoE). We demonstrate the efficacy of such physics-guided problem decomposition for the case of Schrödinger's equation in quantum mechanics. Our proposed PG-MoE model predicts the ground-state solution, i.e., the eigenvector that corresponds to the smallest possible eigenvalue. The model is 150x smaller than the network trained to learn the complex task while being competitive in generalization. To improve the generalization of the PG-MoE, we also employ a physics-guided loss function based on variational energy, which by the principles of quantum mechanics is minimized if and only if the output is the ground-state solution.
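The variational-energy loss mentioned above rests on the Rayleigh quotient: for a Hamiltonian H, the quantity ψᵀHψ / ψᵀψ is minimized exactly when ψ is the ground-state eigenvector. The sketch below (not the authors' code; the 1-D finite-difference Hamiltonian and all names are illustrative) verifies this property with NumPy, which is what makes such a quantity usable as an unsupervised training loss:

```python
import numpy as np

def hamiltonian_1d(n, potential):
    """Toy discretization of H = -d^2/dx^2 + V(x) on n grid points (unit spacing)."""
    laplacian = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return laplacian + np.diag(potential)

def variational_energy(psi, H):
    """Rayleigh quotient <psi|H|psi> / <psi|psi>, minimized by the ground state."""
    return psi @ H @ psi / (psi @ psi)

n = 64
H = hamiltonian_1d(n, potential=0.05 * (np.arange(n) - n / 2) ** 2)

# Reference solution from a dense eigensolver.
eigvals, eigvecs = np.linalg.eigh(H)
ground = eigvecs[:, 0]

# Any other candidate vector (here, random) has strictly higher variational
# energy, so minimizing this loss drives a network's output toward the
# ground state -- without needing labeled eigenvectors.
rng = np.random.default_rng(0)
random_psi = rng.standard_normal(n)
assert variational_energy(ground, H) <= variational_energy(random_psi, H)
assert np.isclose(variational_energy(ground, H), eigvals[0])
```

In the PG-MoE setting, ψ would be assembled from the outputs of the expert networks (each predicting a simpler sub-component of the eigenvector) before the Rayleigh quotient is evaluated as the loss.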