Paper Title

Amenable Sparse Network Investigator

Paper Authors

Saeed Damadi, Erfan Nouri, Hamed Pirsiavash

Paper Abstract

We present the "Amenable Sparse Network Investigator" (ASNI) algorithm, which utilizes a novel pruning strategy based on a sigmoid function that induces the sparsity level globally over the course of a single round of training. The ASNI algorithm fulfills both tasks of which current state-of-the-art strategies can perform only one. The ASNI algorithm has two subalgorithms: 1) ASNI-I and 2) ASNI-II. ASNI-I learns an accurate sparse off-the-shelf network in only a single round of training. ASNI-II learns a sparse network together with an initialization that is quantized, compressed, and from which the sparse network is trainable. The learned initialization is quantized since only two numbers are learned to initialize the nonzero parameters of each layer; thus, for a network with L layers, the initialization of the entire network has 2L quantization levels. The learned initialization is also compressed because it is a set consisting of 2L numbers. A sparse network that can be trained from such a quantized and compressed initialization is called amenable. To the best of our knowledge, no other algorithm can learn a quantized and compressed initialization from which the network is still trainable while also solving both pruning tasks. Our numerical experiments show that there exists a quantized and compressed initialization from which the learned sparse network can be trained and reach an accuracy on par with its dense counterpart. We show experimentally that these 2L quantization levels are the concentration points of the parameters in each layer of the sparse network learned by ASNI-I. To corroborate the above, we performed a series of experiments with ResNet, VGG-style, small convolutional, and fully connected networks on the ImageNet, CIFAR10, and MNIST datasets.
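The abstract does not spell out the exact form of the sigmoid schedule or how the two per-layer numbers are chosen, so the Python sketch below is only an illustration under assumed details: a sigmoid-shaped global sparsity schedule, global magnitude pruning, and a hypothetical two-number-per-layer re-initialization that uses the means of the surviving positive and negative weights as the two learned values. All function names, hyperparameters, and the choice of "means" are assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid_sparsity_schedule(epoch, total_epochs, final_sparsity, k=10.0):
    # Assumed schedule: sparsity ramps from ~0 to final_sparsity along a
    # sigmoid curve centered at the midpoint of training; k sets steepness.
    t = epoch / total_epochs  # normalized training progress in [0, 1]
    return final_sparsity / (1.0 + np.exp(-k * (t - 0.5)))

def global_magnitude_prune(layers, sparsity):
    # Global criterion: rank all parameters across layers by magnitude and
    # zero out the smallest fraction `sparsity` of them.
    all_weights = np.concatenate([w.ravel() for w in layers])
    threshold = np.quantile(np.abs(all_weights), sparsity)
    return [np.where(np.abs(w) > threshold, w, 0.0) for w in layers]

def two_number_init(pruned_layers):
    # Hypothetical ASNI-II-style compressed initialization: keep only two
    # numbers per layer (here, means of positive and negative survivors)
    # and reinitialize every nonzero weight of that layer from them.
    init = []
    for w in pruned_layers:
        pos_mean = w[w > 0].mean() if np.any(w > 0) else 0.0
        neg_mean = w[w < 0].mean() if np.any(w < 0) else 0.0
        init.append(np.where(w > 0, pos_mean, np.where(w < 0, neg_mean, 0.0)))
    return init

# Tiny usage example with random "layers"; the actual training step between
# pruning rounds is omitted.
rng = np.random.default_rng(0)
layers = [rng.normal(size=(64, 32)), rng.normal(size=(32, 10))]
for epoch in range(10):
    s = sigmoid_sparsity_schedule(epoch, total_epochs=10, final_sparsity=0.9)
    layers = global_magnitude_prune(layers, s)
compressed_init = two_number_init(layers)  # 2 values per layer -> 2L levels
```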
