Paper Title
Predictive Coding Approximates Backprop along Arbitrary Computation Graphs
Paper Authors
Paper Abstract
Backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. However, backprop is often criticised for lacking biological plausibility. Recently, it has been shown that backprop in multilayer-perceptrons (MLPs) can be approximated using predictive coding, a biologically-plausible process theory of cortical computation which relies only on local and Hebbian updates. The power of backprop, however, lies not in its instantiation in MLPs, but rather in the concept of automatic differentiation which allows for the optimisation of any differentiable program expressed as a computation graph. Here, we demonstrate that predictive coding converges asymptotically (and in practice rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules. We apply this result to develop a straightforward strategy to translate core machine learning architectures into their predictive coding equivalents. We construct predictive coding CNNs, RNNs, and the more complex LSTMs, which include a non-layer-like branching internal graph structure and multiplicative interactions. Our models perform equivalently to backprop on challenging machine learning benchmarks, while utilising only local and (mostly) Hebbian plasticity. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry, and may also contribute to the development of completely distributed neuromorphic architectures.
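The abstract's central claim can be illustrated on a minimal two-weight chain: clamp the output node to the target, hold the predictions at their feedforward values (the paper's fixed prediction assumption), and let the hidden node relax by descending a local energy. At equilibrium, the local prediction errors equal the (negated) backprop gradients. The sketch below is illustrative, not the paper's implementation; layer sizes, the learning rate, and the iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
h = np.tanh
dh = lambda z: 1.0 - np.tanh(z) ** 2

# Tiny chain-structured computation graph (sizes chosen for illustration)
sizes = [4, 5, 3]
W = [rng.normal(0.0, 0.5, (sizes[i + 1], sizes[i])) for i in range(2)]
x0 = rng.normal(size=sizes[0])
target = rng.normal(size=sizes[2])

# --- standard backprop reference ---
z1 = W[0] @ h(x0)
z2 = W[1] @ h(z1)
delta2 = z2 - target                     # dL/dz2 for L = 0.5 * ||z2 - t||^2
delta1 = dh(z1) * (W[1].T @ delta2)      # chain rule through the hidden node

# --- predictive coding relaxation ---
# Output node clamped to the target; predictions held at feedforward values
# (fixed prediction assumption). The hidden node descends the local energy
# F = 0.5 * (||e1||^2 + ||e2||^2) using only locally available quantities.
e2 = target - z2                         # output prediction error (fixed)
x1 = z1.copy()                           # hidden node, initialised at z1
lr = 0.1
for _ in range(500):
    e1 = x1 - z1                         # local error vs. fixed prediction
    x1 += lr * (-e1 + dh(z1) * (W[1].T @ e2))  # purely local update

# At equilibrium, prediction errors equal the negated backprop gradients;
# a Hebbian weight update proportional to e2 @ h(x1).T then matches
# gradient descent on the loss.
print(np.allclose(e1, -delta1))          # prints True
```

Note that the relaxation converges geometrically (each step shrinks the gap to the fixed point by a factor of `1 - lr`), which reflects the abstract's "in practice rapidly" claim on this simple graph.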