Paper Title

Deconstructing the Structure of Sparse Neural Networks

Authors

Maxwell Van Gelder, Mitchell Wortsman, Kiana Ehsani

Abstract

Although sparse neural networks have been studied extensively, the focus has been primarily on accuracy. In this work, we focus instead on network structure, and analyze three popular algorithms. We first measure performance when structure persists and weights are reset to a different random initialization, thereby extending experiments in Deconstructing Lottery Tickets (Zhou et al., 2019). This experiment reveals that accuracy can be derived from structure alone. Second, to measure structural robustness we investigate the sensitivity of sparse neural networks to further pruning after training, finding a stark contrast between algorithms. Finally, for a recent dynamic sparsity algorithm we investigate how early in training the structure emerges. We find that even after one epoch the structure is mostly determined, allowing us to propose a more efficient algorithm which does not require dense gradients throughout training. In looking back at algorithms for sparse neural networks and analyzing their performance from a different lens, we uncover several interesting properties and promising directions for future research.
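To make the first two structural probes concrete, here is a minimal PyTorch sketch; it is not the authors' code, and `model`, `masks`, and both function names are hypothetical stand-ins. The first function keeps each layer's learned binary sparsity mask but re-draws the surviving weights from a fresh random initialization, so any accuracy recovered after retraining is attributable to structure alone. The second prunes a trained sparse network further by weight magnitude, one common way to probe the sensitivity to further pruning described above.

```python
# Minimal sketch (assumption: not the paper's actual implementation).
# `masks` maps module names to binary tensors the same shape as the weights.
import torch
import torch.nn as nn


def reset_weights_keep_structure(model: nn.Module,
                                 masks: dict[str, torch.Tensor]) -> None:
    """Keep each layer's sparsity mask but reset its weights randomly."""
    for name, module in model.named_modules():
        if name in masks:
            nn.init.kaiming_normal_(module.weight)   # fresh random init
            with torch.no_grad():
                module.weight.mul_(masks[name])      # zero the pruned connections


def prune_further(model: nn.Module,
                  masks: dict[str, torch.Tensor],
                  frac: float) -> None:
    """Remove an extra `frac` of surviving weights, smallest magnitude first."""
    for name, module in model.named_modules():
        if name in masks:
            scores = module.weight.detach().abs()
            alive = scores[masks[name].bool()]       # magnitudes of surviving weights
            k = int(frac * alive.numel())
            if k == 0:
                continue
            threshold = alive.kthvalue(k).values     # k-th smallest surviving magnitude
            masks[name] = masks[name] * (scores > threshold).float()
            with torch.no_grad():
                module.weight.mul_(masks[name])
```

In either probe, the mask would be re-applied after each optimizer step during any subsequent training so that pruned connections stay at zero.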
