An initial alignment between neural network and target is needed for gradient descent to learn

Paper title

An initial alignment between neural network and target is needed for gradient descent to learn

Authors

Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła, Christopher Marquis

Abstract

This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function. It is proved that if a network and a Boolean target function do not have a noticeable INAL, then noisy gradient descent on a fully connected network with normalized i.i.d. initialization will not learn in polynomial time. Thus a certain amount of knowledge about the target (measured by the INAL) is needed in the architecture design. This also provides an answer to an open problem posed in [AS20]. The results are based on deriving lower bounds for descent algorithms on symmetric neural networks without explicit knowledge of the target function beyond its INAL.
