Paper title
Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning
Paper authors
Paper abstract
Conventional NAS-based pruning algorithms aim to find the sub-network with the best validation performance. However, validation performance does not reliably represent test performance, i.e., potential performance. Also, although fine-tuning the pruned network to recover the performance drop is an inevitable step, few studies have addressed it. This paper proposes a novel Ensemble Knowledge Guidance (EKG) to solve both problems at once. First, we show experimentally that the fluctuation of the loss landscape can be an effective metric for evaluating potential performance. To search for the sub-network with the smoothest loss landscape at low cost, we employ EKG as the search reward. The EKG used in each subsequent search iteration is composed of the ensemble knowledge of interim sub-networks, i.e., the by-products of sub-network evaluation. Next, we reuse EKG to provide gentle and informative guidance to the pruned network during fine-tuning. Since EKG is implemented as a memory bank in both phases, its cost is negligible. For example, when pruning and training ResNet-50, just 315 GPU hours are required to remove around 45.04% of FLOPs without any performance degradation, so the method can run even on a low-spec workstation. The implemented code is available at https://github.com/sseung0703/EKG.
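The abstract describes EKG as a memory bank holding the ensemble knowledge of interim sub-networks, which is then reused as guidance. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes the bank stores a running mean of softmax predictions per training sample and that guidance is a KL-divergence loss against that mean. The names `EnsembleMemoryBank`, `update`, `target`, and `kl_divergence` are hypothetical and do not come from the paper or its repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class EnsembleMemoryBank:
    """Hypothetical memory bank: running mean of predictions per sample id.

    Assumption: each interim sub-network contributes its softmax output once
    per sample, and the ensemble knowledge is the plain average of them.
    """

    def __init__(self):
        self._sums = {}
        self._counts = {}

    def update(self, sample_id, logits):
        """Add one sub-network's prediction for a sample to the bank."""
        probs = softmax(logits)
        if sample_id not in self._sums:
            self._sums[sample_id] = [0.0] * len(probs)
            self._counts[sample_id] = 0
        self._sums[sample_id] = [s + p for s, p in zip(self._sums[sample_id], probs)]
        self._counts[sample_id] += 1

    def target(self, sample_id):
        """Return the averaged (ensemble) distribution for a sample."""
        n = self._counts[sample_id]
        return [s / n for s in self._sums[sample_id]]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), used here as the assumed guidance loss."""
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, predicted))


# Two interim sub-networks evaluate the same sample; the logits are made up.
bank = EnsembleMemoryBank()
bank.update("img_0", [2.0, 0.5, -1.0])
bank.update("img_0", [1.5, 1.0, -0.5])

soft_target = bank.target("img_0")
student_probs = softmax([1.0, 1.0, 0.0])  # prediction of the pruned network
guidance_loss = kl_divergence(soft_target, student_probs)
```

Because the bank stores only per-sample distributions, reading a target needs no extra forward pass through a teacher network, which is consistent with the abstract's claim that the cost is negligible. The paper may weight or select sub-networks differently from the plain average assumed here.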