Paper Title
Online Training Through Time for Spiking Neural Networks
Paper Authors
Paper Abstract
Spiking neural networks (SNNs) are promising brain-inspired energy-efficient models. Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency. In particular, backpropagation through time (BPTT) with surrogate gradients (SG) is popularly used to achieve high performance in a very small number of time steps. However, this comes at the cost of large memory consumption for training, a lack of theoretical clarity for optimization, and inconsistency with the online property of biological learning and of learning rules on neuromorphic hardware. Other works connect the spike representations of SNNs with equivalent artificial neural network formulations and train SNNs by gradients from the equivalent mappings to ensure descent directions, but they fail to achieve low latency and are also not online. In this work, we propose online training through time (OTTT) for SNNs, which is derived from BPTT to enable forward-in-time learning by tracking presynaptic activities and leveraging instantaneous loss and gradients. Meanwhile, we theoretically analyze and prove that the gradients of OTTT provide a descent direction for optimization similar to gradients based on spike representations, under both feedforward and recurrent conditions. OTTT requires only constant training memory, agnostic to the number of time steps, avoiding the significant memory costs of BPTT for GPU training. Furthermore, the update rule of OTTT takes the form of three-factor Hebbian learning, which could pave the way for online on-chip learning. With OTTT, the two mainstream supervised SNN training methods, BPTT with SG and spike-representation-based training, are connected for the first time, and in a biologically plausible form. Experiments on CIFAR-10, CIFAR-100, ImageNet, and CIFAR10-DVS demonstrate the superior performance of our method on large-scale static and neuromorphic datasets with a small number of time steps.
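To make the mechanism concrete, below is a minimal sketch of an OTTT-style update for a single fully connected leaky-integrate-and-fire (LIF) layer: presynaptic spikes are accumulated into a decaying trace, and at each time step the weight gradient is the outer product of an instantaneous error signal and that trace, so no activations need to be stored across time steps. All names here (`ottt_step`, `error_fn`), the decay factor, the soft-reset dynamics, and the toy error signal are illustrative assumptions, not the paper's exact implementation.

```python
import torch

lam = 0.5    # assumed trace / membrane decay factor
v_th = 1.0   # assumed firing threshold

def ottt_step(W, v, trace, s_in, error_fn):
    """One forward-in-time step for a fully connected LIF layer.

    W: (n_out, n_in) weights; v: (n_out,) membrane potentials;
    trace: (n_in,) presynaptic trace; s_in: (n_in,) binary input spikes;
    error_fn: hypothetical map from output spikes to the instantaneous
    error signal at this time step.
    """
    # Track presynaptic activity: a_hat_t = lam * a_hat_{t-1} + s_t.
    trace = lam * trace + s_in
    # LIF dynamics with soft reset (subtract the threshold on spiking).
    v = lam * v + W @ s_in
    s_out = (v >= v_th).float()
    v = v - v_th * s_out
    # Instantaneous weight gradient: outer product of the error signal and
    # the presynaptic trace -- a three-factor, Hebbian-like update that
    # keeps training memory constant in the number of time steps.
    grad_W = torch.outer(error_fn(s_out), trace)
    return grad_W, v, trace, s_out

# Toy usage: T online steps, applying the update immediately at each step.
n_in, n_out, T, lr = 100, 10, 6, 1e-2
W = torch.zeros(n_out, n_in)
v, trace = torch.zeros(n_out), torch.zeros(n_in)
target = torch.zeros(n_out)                    # hypothetical per-step target
for t in range(T):
    s_in = (torch.rand(n_in) < 0.2).float()    # random spike input
    grad_W, v, trace, s_out = ottt_step(W, v, trace, s_in,
                                        error_fn=lambda s: s - target)
    W = W - lr * grad_W                        # constant-memory online update
```

In this sketch the gradient can be applied immediately or accumulated over steps; either way, only the current membrane potentials and traces are kept, which is what makes the memory cost agnostic to the number of time steps.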