Paper Title
Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation
Paper Authors
Paper Abstract
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware. However, it is a challenge to efficiently train SNNs due to their non-differentiability. Most existing methods either suffer from high latency (i.e., long simulation time steps) or cannot achieve as high performance as Artificial Neural Networks (ANNs). In this paper, we propose the Differentiation on Spike Representation (DSR) method, which achieves high performance competitive with ANNs yet with low latency. First, we encode the spike trains into a spike representation using (weighted) firing rate coding. Based on the spike representation, we systematically derive that the spiking dynamics with common neural models can be represented as some sub-differentiable mapping. From this viewpoint, the proposed DSR method trains SNNs through the gradients of the mapping and avoids the common non-differentiability problem in SNN training. Then we analyze the error incurred when representing the specific mapping with the forward computation of the SNN. To reduce such error, we propose to train the spike threshold in each layer and to introduce a new hyperparameter for the neural models. With these components, the DSR method achieves state-of-the-art SNN performance with low latency on both static and neuromorphic datasets, including CIFAR-10, CIFAR-100, ImageNet, and DVS-CIFAR10.
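The abstract compresses two technical steps: encoding spike trains into a spike representation, and training through the gradients of a sub-differentiable mapping rather than through the spikes themselves. The following minimal PyTorch sketch illustrates both under stated assumptions; the function names, the exact coding weights, and the choice of a clamp as the sub-differentiable mapping are illustrative choices of ours, not the paper's released implementation.

```python
import torch

# Illustrative sketch only (not the authors' code). The clamp as the
# sub-differentiable mapping and the geometric coding weights are our
# assumptions for the IF/LIF setting the abstract alludes to.

def weighted_firing_rate(spikes: torch.Tensor, v_th: float = 1.0,
                         decay: float = 1.0) -> torch.Tensor:
    """Encode a {0,1} float spike train of shape (T, ...) into a spike
    representation. decay == 1.0 gives plain firing-rate coding (IF-style);
    decay in (0, 1) gives a weighted firing rate (LIF-style leakage)."""
    T = spikes.shape[0]
    w = decay ** torch.arange(T - 1, -1, -1, dtype=spikes.dtype)
    w = w.view(-1, *([1] * (spikes.dim() - 1)))  # broadcast over batch dims
    return v_th * (w * spikes).sum(dim=0) / w.sum()

def dsr_style_activation(weighted_input: torch.Tensor,
                         spike_repr: torch.Tensor,
                         v_th: float = 1.0) -> torch.Tensor:
    """Forward pass emits the SNN's actual spike representation; the backward
    pass differentiates a clamp, standing in for the spiking dynamics, so
    training never touches the non-differentiable spike function."""
    surrogate = torch.clamp(weighted_input, min=0.0, max=v_th)
    return surrogate + (spike_repr - surrogate).detach()

# Toy usage: 4 time steps, batch of 2, 3 neurons.
spikes = torch.randint(0, 2, (4, 2, 3)).float()
r = weighted_firing_rate(spikes, v_th=1.0, decay=0.9)
```

The straight-through substitution in `dsr_style_activation` is the standard way to pair one forward value with another function's gradient; how the trainable thresholds and the new neural-model hyperparameter mentioned in the abstract enter the mapping is detailed in the paper and not reproduced here.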