Paper Title


Finding Dynamics Preserving Adversarial Winning Tickets

Authors

Xupeng Shi, Pengfei Zheng, A. Adam Ding, Yuan Gao, Weizhong Zhang

Abstract


Modern deep neural networks (DNNs) are vulnerable to adversarial attacks, and adversarial training has been shown to be a promising method for improving the adversarial robustness of DNNs. Pruning methods have been considered in the adversarial context to reduce model capacity and simultaneously improve adversarial robustness during training. Existing adversarial pruning methods generally mimic the classical pruning methods for natural training, which follow the three-stage 'training-pruning-fine-tuning' pipeline. We observe that such pruning methods do not necessarily preserve the dynamics of the dense network, making it potentially hard to fine-tune them to compensate for the accuracy degradation caused by pruning. Based on recent work on the \textit{Neural Tangent Kernel} (NTK), we systematically study the dynamics of adversarial training and prove the existence of a trainable sparse sub-network at initialization which can be trained to be adversarially robust from scratch. This theoretically verifies the \textit{lottery ticket hypothesis} in the adversarial context, and we refer to such a sub-network structure as an \textit{Adversarial Winning Ticket} (AWT). We also show empirical evidence that AWT preserves the dynamics of adversarial training and achieves performance equal to dense adversarial training.
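The abstract's core idea, that a good pruning mask should preserve the training dynamics captured by the empirical NTK, can be illustrated with a toy computation. The sketch below computes the empirical NTK of a tiny network at initialization via a finite-difference Jacobian, applies a mask, and measures how much the NTK changes. The mask criterion used here (keeping parameters with the largest Jacobian column norms) is a hypothetical stand-in for illustration only, not the paper's actual AWT selection rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer network; parameters flattened into one vector.
d_in, d_hid = 4, 8
W1 = rng.normal(size=(d_hid, d_in)) / np.sqrt(d_in)
w2 = rng.normal(size=d_hid) / np.sqrt(d_hid)
params = np.concatenate([W1.ravel(), w2])

def forward(theta, X):
    W1 = theta[:d_hid * d_in].reshape(d_hid, d_in)
    w2 = theta[d_hid * d_in:]
    return np.tanh(X @ W1.T) @ w2

def jacobian(theta, X, eps=1e-5):
    # Finite-difference Jacobian of the network output w.r.t. parameters.
    base = forward(theta, X)
    J = np.zeros((X.shape[0], theta.size))
    for i in range(theta.size):
        t = theta.copy()
        t[i] += eps
        J[:, i] = (forward(t, X) - base) / eps
    return J

X = rng.normal(size=(5, d_in))
J = jacobian(params, X)
ntk_dense = J @ J.T  # empirical NTK at initialization

# Hypothetical scoring rule: keep the half of the parameters whose
# Jacobian columns have the largest norm (NOT the paper's AWT criterion).
keep = params.size // 2
score = np.linalg.norm(J, axis=0)
mask = np.zeros_like(params)
mask[np.argsort(score)[-keep:]] = 1.0

J_sparse = J * mask  # gradients seen by the masked sub-network
ntk_sparse = J_sparse @ J_sparse.T

# How far the sub-network's NTK drifts from the dense one: a
# dynamics-preserving ticket should keep this small.
rel_err = np.linalg.norm(ntk_dense - ntk_sparse) / np.linalg.norm(ntk_dense)
print(f"relative NTK change after pruning 50%: {rel_err:.3f}")
```

In this framing, searching for an AWT amounts to choosing the mask that minimizes such an NTK discrepancy under an adversarial-training objective, rather than pruning after training and hoping fine-tuning recovers the lost dynamics.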
