Paper Title

SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training

Authors

Pengcheng Dai, Jianlei Yang, Xucheng Ye, Xingzhou Cheng, Junyu Luo, Linghao Song, Yiran Chen, Weisheng Zhao

Abstract

Training Convolutional Neural Networks (CNNs) usually requires a large amount of computational resources. In this paper, \textit{SparseTrain} is proposed to accelerate CNN training by fully exploiting sparsity. It involves three levels of innovation: an activation gradient pruning algorithm, a sparse training dataflow, and an accelerator architecture. By applying a stochastic pruning algorithm to each layer, the sparsity of back-propagation gradients can be increased dramatically without degrading training accuracy or convergence rate. Moreover, to utilize both \textit{natural sparsity} (resulting from ReLU or Pooling layers) and \textit{artificial sparsity} (introduced by the pruning algorithm), a sparse-aware architecture is proposed for training acceleration. This architecture supports both the forward and backward propagation of CNNs by adopting a 1-dimensional convolution dataflow. We have built a simple compiler to map CNN topologies onto \textit{SparseTrain}, and a cycle-accurate architecture simulator to evaluate performance and efficiency based on a synthesized design in a $14nm$ FinFET technology. Evaluation results on AlexNet/ResNet show that \textit{SparseTrain} achieves about $2.7\times$ speedup and $2.2\times$ energy-efficiency improvement on average compared with the original training process.
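The abstract does not spell out the stochastic pruning rule, so the following is only a minimal sketch of one common unbiased magnitude-threshold scheme for sparsifying activation gradients; the function name `stochastic_prune`, the `threshold` parameter, and the keep-or-zero rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_prune(grad, threshold):
    """Stochastically prune small-magnitude gradient entries (illustrative sketch).

    Entries with |g| >= threshold are kept unchanged. Entries with
    |g| < threshold are set to 0 with probability 1 - |g|/threshold,
    or to sign(g) * threshold with probability |g|/threshold, so each
    pruned entry remains an unbiased estimate of the original value.
    """
    g = np.asarray(grad, dtype=np.float32)
    small = np.abs(g) < threshold
    keep_prob = np.abs(g) / threshold              # in [0, 1) for small entries
    survive = np.random.rand(*g.shape) < keep_prob
    pruned = np.where(small & survive, np.sign(g) * threshold, g)
    pruned = np.where(small & ~survive, 0.0, pruned)
    return pruned

if __name__ == "__main__":
    # Most small gradients become exact zeros (higher sparsity),
    # while the expectation of each entry is preserved.
    grads = np.random.randn(4, 8).astype(np.float32) * 1e-3
    sparse_grads = stochastic_prune(grads, threshold=1e-3)
    print("sparsity:", np.mean(sparse_grads == 0.0))
```

The extra gradient sparsity produced by a step like this (the \textit{artificial sparsity}) is what the sparse training dataflow and accelerator architecture can then exploit, alongside the \textit{natural sparsity} from ReLU and Pooling layers.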
