Paper title
Learning credit assignment
Paper authors
Paper abstract
Deep learning has achieved impressive prediction accuracy in a variety of scientific and industrial domains. However, the nested non-linear feature of deep learning makes the learning highly non-transparent, i.e., it is still unknown how the learning coordinates a huge number of parameters to reach a decision. To explain this hierarchical credit assignment, we propose a mean-field learning model by assuming that an ensemble of sub-networks, rather than a single network, is trained for a classification task. Surprisingly, our model reveals that apart from some deterministic synaptic weights connecting two neurons at neighboring layers, there exist a large number of connections that can be absent, and other connections can allow for a broad distribution of their weight values. Synaptic connections can therefore be classified into three categories: very important ones, unimportant ones, and those of variability that may partially encode nuisance factors. Our model thus learns the credit assignment leading to the decision, and predicts an ensemble of sub-networks that can accomplish the same task, thereby providing insights toward understanding the macroscopic behavior of deep learning through the lens of distinct roles of synaptic weights.
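The three categories of connections described in the abstract can be pictured with a small sketch. The abstract does not state how the weight distribution is parameterized, so everything below is an assumption made for illustration: each connection is described by a hypothetical absence probability `pi`, a mean, and a variance, and the thresholds `pi_threshold` and `var_threshold` are arbitrary. This is not the authors' model, only one way to make the classification and the idea of sampling a sub-network concrete.

```python
import random


def classify_connection(pi, variance, pi_threshold=0.9, var_threshold=0.01):
    """Label one connection from hypothetical per-connection statistics.

    pi: assumed probability that the connection is absent (weight is zero).
    variance: assumed variance of the weight when the connection is present.
    Both thresholds are illustrative choices, not values from the paper.
    """
    if pi >= pi_threshold:
        return "unimportant"      # almost always absent across the ensemble
    if variance <= var_threshold:
        return "very important"   # present and nearly deterministic
    return "variable"             # present, but with a broad weight distribution


def sample_subnetwork(params, rng):
    """Draw one sub-network: each weight is 0 with probability pi, else Gaussian."""
    weights = []
    for pi, mean, variance in params:
        if rng.random() < pi:
            weights.append(0.0)
        else:
            weights.append(rng.gauss(mean, variance ** 0.5))
    return weights


# Hypothetical (pi, mean, variance) triples for three connections.
params = [(0.0, 1.2, 0.0), (0.99, 0.0, 0.5), (0.1, -0.3, 0.8)]
labels = [classify_connection(pi, var) for pi, _, var in params]
subnet = sample_subnetwork(params, random.Random(0))
print(labels)
```

Under these assumptions, repeated calls to `sample_subnetwork` yield different sub-networks that share the deterministic weights and differ in the variable ones, which matches the abstract's picture of an ensemble of sub-networks accomplishing the same task.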