Paper title

Implicit bias with Ritz-Galerkin method in understanding deep learning for solving PDEs

Authors

Wang, Jihong; Xu, Zhi-Qin John; Zhang, Jiwei; Zhang, Yaoyu

Abstract

This paper studies the difference between the Ritz-Galerkin (R-G) method and the deep neural network (DNN) method for solving partial differential equations (PDEs), with the aim of better understanding deep learning. To this end, we consider a particular Poisson problem in which the right-hand side f of the equation is available only at n sample points, that is, f is known at finitely many sample points. Through both theoretical and numerical studies, we show that the solution of the R-G method converges to a piecewise linear function for the one-dimensional (1D) problem, or to functions of lower regularity for high-dimensional problems. In the same setting, however, DNNs learn a relatively smooth solution regardless of the dimension; that is, DNNs are implicitly biased towards functions with more low-frequency components among all functions that can fit the equation at the available data points. This bias is explained by recent studies of the frequency principle (Xu et al. (2019) [17] and Zhang et al. (2019) [11, 19]). Beyond the similarity between traditional numerical methods and DNNs from the approximation perspective, our work shows that the implicit bias in the learning process, which differs from traditional numerical methods, could help better understand the characteristics of DNNs.
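
The 1D claim in the abstract can be illustrated with a small numerical sketch. It is not the authors' code. It assumes the problem -u'' = f on (0, 1) with u(0) = u(1) = 0, a sine basis for the R-G space, and a right-hand side that enters the Galerkin system only through an empirical average over the sample points. The sample locations `x_s`, the values `f_s`, and both function names are hypothetical choices made for illustration. Under these assumptions the R-G solution approaches a sum of Green's functions centred at the sample points, which is piecewise linear with kinks exactly at those points.

```python
import numpy as np

def rg_solution(x_eval, x_s, f_s, m):
    """R-G solution of -u'' = f on (0, 1), u(0) = u(1) = 0, with m sine
    basis functions and f known only at the sample points x_s (assumed setup)."""
    k = np.arange(1, m + 1)
    # Empirical load vector: b_k = (1/n) * sum_i f(x_i) * sin(k*pi*x_i)
    b = np.sin(np.pi * np.outer(k, x_s)) @ f_s / len(x_s)
    # The stiffness matrix is diagonal for sines: A_kk = (k*pi)^2 / 2
    c = 2.0 * b / (np.pi * k) ** 2
    return np.sin(np.pi * np.outer(x_eval, k)) @ c

def piecewise_linear_limit(x_eval, x_s, f_s):
    """Candidate limit as m grows: (1/n) * sum_i f(x_i) * G(x, x_i),
    where G is the Green's function of -u'' with zero boundary values."""
    x = x_eval[:, None]
    s = x_s[None, :]
    G = np.where(x <= s, x * (1.0 - s), s * (1.0 - x))
    return G @ f_s / len(x_s)

# Hypothetical sample points and right-hand-side values.
x_s = np.array([0.2, 0.5, 0.7])
f_s = np.array([1.0, -2.0, 1.5])

x_eval = np.linspace(0.0, 1.0, 201)
u_limit = piecewise_linear_limit(x_eval, x_s, f_s)

# Maximum gap between the R-G solution and the piecewise linear candidate
# for an increasing number of basis functions.
errors = {
    m: np.max(np.abs(rg_solution(x_eval, x_s, f_s, m) - u_limit))
    for m in (10, 100, 2000)
}
for m, err in errors.items():
    print(f"m = {m:5d}  max |u_m - u_limit| = {err:.2e}")
```

The gap shrinks as `m` grows, which is consistent with the convergence statement in the abstract for this particular setup. A DNN trained on the same three points would not be forced to place kinks there, which is the contrast the paper draws.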
