Paper Title

Two Routes to Scalable Credit Assignment without Weight Symmetry

Paper Authors

Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan M. Bloom, Daniel L. K. Yamins

Paper Abstract

The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport: the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another. Until recently, attempts to create local learning rules that avoid weight transport have typically failed in the large-scale learning scenarios where backpropagation shines, e.g. ImageNet categorization with deep convolutional networks. Here, we investigate a recently proposed local learning rule that yields competitive performance with backpropagation and find that it is highly sensitive to metaparameter choices, requiring laborious tuning that does not transfer across network architecture. Our analysis indicates the underlying mathematical reason for this instability, allowing us to identify a more robust local learning rule that better transfers without metaparameter tuning. Nonetheless, we find a performance and stability gap between this local rule and backpropagation that widens with increasing model depth. We then investigate several non-local learning rules that relax the need for instantaneous weight transport into a more biologically-plausible "weight estimation" process, showing that these rules match state-of-the-art performance on deep networks and operate effectively in the presence of noisy updates. Taken together, our results suggest two routes towards the discovery of neural implementations for credit assignment without weight symmetry: further improvement of local rules so that they perform consistently across architectures and the identification of biological implementations for non-local learning mechanisms.
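To make the "weight transport" issue concrete, the following is a minimal numpy sketch (not the paper's specific learning rules) contrasting the backward pass of backpropagation, which reuses the forward weights W and therefore assumes weight symmetry, with a feedback-alignment-style alternative that routes errors through a separate feedback matrix B. The layer sizes, the ReLU nonlinearity, and the names W, B, and lr are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

# Illustrative sketch: weight transport vs. a separate feedback pathway.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # layer input
W = rng.standard_normal((32, 64))    # forward weights
B = rng.standard_normal((32, 64))    # separate feedback weights (no weight symmetry)

h = W @ x                            # pre-activation
y = np.maximum(h, 0.0)               # ReLU activation (forward pass)
delta_y = rng.standard_normal(32)    # error signal arriving from the layer above

delta_h = delta_y * (h > 0)          # gradient through the ReLU

# Backpropagation: the error sent to the layer below reuses W itself,
# i.e. the backward pathway must "transport" the forward synaptic weights.
delta_x_backprop = W.T @ delta_h

# Feedback-alignment-style local rule: the error is routed through B instead,
# so no neuron needs to read out another neuron's synaptic weights.
delta_x_local = B.T @ delta_h

# In both cases the forward weights are updated from locally available signals.
lr = 1e-2
W -= lr * np.outer(delta_h, x)
```

The non-local "weight estimation" rules discussed in the abstract sit between these two extremes: rather than fixing B at random or copying W instantaneously, they gradually bring the feedback pathway into agreement with the forward weights.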
