Paper Title
Dynamically Stable Infinite-Width Limits of Neural Classifiers
Paper Authors
Paper Abstract
Recent research has focused on two different approaches to studying neural network training in the infinite-width limit: (1) the mean-field (MF) approximation and (2) the constant neural tangent kernel (NTK) approximation. These two approaches scale hyperparameters differently with the width of a network layer and, as a result, lead to different infinite-width limit models. We propose a general framework to study how the limit behavior of neural models depends on the scaling of hyperparameters with network width. Our framework allows us to derive scalings for the existing MF and NTK limits, as well as uncountably many other scalings that lead to dynamically stable limit behavior of the corresponding models. However, these scalings induce only a finite number of distinct limit models. Each distinct limit model corresponds to a unique combination of properties such as boundedness of logits and tangent kernels at initialization, or stationarity of tangent kernels. The existing MF and NTK limit models, as well as one novel limit model, satisfy most of the properties demonstrated by finite-width models. We also propose a novel initialization-corrected mean-field limit that satisfies all of the properties noted above; its corresponding model is a simple modification of a finite-width model.
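
As a concrete illustration of the width scalings the abstract refers to, the sketch below shows the canonical one-hidden-layer parameterizations commonly associated with the NTK and MF limits, together with the tangent kernel whose boundedness and stationarity distinguish the limit models. These formulas are standard in the literature, not taken verbatim from the paper, whose framework covers a more general family of scalings.

% Illustrative LaTeX sketch, assuming a one-hidden-layer network of width n
% with hidden weights w_i, output weights a_i, and nonlinearity \sigma.
% NTK scaling (output weighted by 1/sqrt(n)): logits and tangent kernel
% remain bounded at initialization, and the kernel becomes stationary.
f_{\mathrm{NTK}}(x) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} a_i \,\sigma(w_i^{\top} x)
% Mean-field scaling (output weighted by 1/n): training dynamics converge
% to an evolution of the distribution over neuron weights.
f_{\mathrm{MF}}(x) = \frac{1}{n} \sum_{i=1}^{n} a_i \,\sigma(w_i^{\top} x)
% Tangent kernel with respect to all parameters \theta = (a, w):
\Theta(x, x') = \nabla_{\theta} f(x)^{\top} \nabla_{\theta} f(x')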