Paper Title

PAD-Net: An Efficient Framework for Dynamic Networks

Authors

Shwai He, Liang Ding, Daize Dong, Boan Liu, Fuqiang Yu, Dacheng Tao

Abstract

Dynamic networks, e.g., Dynamic Convolution (DY-Conv) and the Mixture of Experts (MoE), have been extensively explored because they can considerably improve a model's representation power at an acceptable computational cost. The common practice when implementing dynamic networks is to convert the given static layers into fully dynamic ones, where all parameters are dynamic (at least within a single layer) and vary with the input. However, such a fully dynamic setting may cause redundant parameters and high deployment costs, limiting the applicability of dynamic networks to a broader range of tasks and models. The main contributions of our work are challenging this basic assumption in dynamic networks and proposing a partially dynamic network, namely PAD-Net, which transforms the redundant dynamic parameters into static ones. We further design Iterative Mode Partition to partition dynamic and static parameters efficiently. Our method is supported by large-scale experiments with two typical advanced dynamic architectures, DY-Conv and MoE, on both image classification and GLUE benchmarks. Encouragingly, we surpass the fully dynamic networks by +0.7% top-1 accuracy with only 30% dynamic parameters for ResNet-50, and by +1.9% average score in language understanding with only 50% dynamic parameters for BERT. Code will be released at: https://github.com/Shwai-He/PAD-Net
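To make the idea of a partially dynamic layer concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the gate-weighted mixture used as the "dynamic" weight, and the score-based mask are all hypothetical. The abstract does not say how Iterative Mode Partition selects parameters, so a placeholder importance score stands in for that step.

```python
def mixture_weight(experts, gates):
    """Input-dependent weight: gate-weighted sum of expert weight vectors.

    In a real dynamic layer the gates would be computed from the input;
    here they are passed in directly to keep the sketch small.
    """
    n = len(experts[0])
    return [sum(g * e[i] for g, e in zip(gates, experts)) for i in range(n)]


def top_fraction_mask(scores, fraction):
    """Mark the highest-scoring `fraction` of positions as dynamic.

    Hypothetical selection rule; it is NOT claimed to be the paper's
    Iterative Mode Partition.
    """
    k = round(len(scores) * fraction)
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    mask = [0] * len(scores)
    for i in keep:
        mask[i] = 1
    return mask


def partially_dynamic_weight(static_w, experts, gates, dynamic_mask):
    """Use the input-dependent value where the mask is 1, the static value elsewhere."""
    dyn = mixture_weight(experts, gates)
    return [d if m else s for s, d, m in zip(static_w, dyn, dynamic_mask)]


# Toy example: 10 parameters, 30% of them kept dynamic.
static_w = [0.5, -0.2, 0.1, 0.9, 0.0, 0.3, -0.7, 0.4, 0.2, -0.1]
experts = [[1.0] * 10, [3.0] * 10]   # two hypothetical expert weight vectors
gates = [0.25, 0.75]                 # would depend on the input in practice
scores = [0.9, 0.1, 0.8, 0.2, 0.3, 0.7, 0.05, 0.15, 0.25, 0.35]

mask = top_fraction_mask(scores, 0.3)
w = partially_dynamic_weight(static_w, experts, gates, mask)
print(mask)
print(w)
```

Only the masked positions change with the gates; the remaining 70% of the weights are fixed and could be stored once, which is the source of the parameter savings the abstract describes.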
