Paper Title
Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples
Paper Authors
Paper Abstract
Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in a bid to uncover its limits. We systematically study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness. We discover that it is possible to train robust models that go well beyond state-of-the-art results by combining larger models, Swish/SiLU activations and model weight averaging. We demonstrate large improvements on CIFAR-10 and CIFAR-100 against $\ell_\infty$ and $\ell_2$ norm-bounded perturbations of size $8/255$ and $128/255$, respectively. In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35% with respect to prior art). Without additional data, we obtain an accuracy under attack of 57.20% (+3.46%). To test the generality of our findings and without any additional modifications, we obtain an accuracy under attack of 80.53% (+7.62%) against $\ell_2$ perturbations of size $128/255$ on CIFAR-10, and of 36.88% (+8.46%) against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-100. All models are available at https://github.com/deepmind/deepmind-research/tree/master/adversarial_robustness.
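The robust models described above are trained against $\ell_\infty$ norm-bounded perturbations such as $8/255$. A minimal sketch of the inner maximization used in adversarial training — one projected-gradient (PGD) ascent step under an $\ell_\infty$ budget — is below. This is an illustrative NumPy sketch, not the paper's implementation; the random "gradient" stands in for a real model gradient, and the step size `alpha` is an assumed value.

```python
import numpy as np

def linf_pgd_step(x, x_adv, grad, eps=8/255, alpha=2/255):
    """One PGD ascent step under an l_inf budget of eps.

    x: clean input, x_adv: current adversarial iterate,
    grad: gradient of the loss w.r.t. x_adv (same shape as x).
    """
    x_adv = x_adv + alpha * np.sign(grad)      # ascend the loss
    x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the l_inf ball
    return np.clip(x_adv, 0.0, 1.0)            # keep valid pixel range

# Toy usage: a random "image" and placeholder gradients.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(3, 4, 4))
x_adv = x.copy()
for _ in range(10):
    fake_grad = rng.standard_normal(x.shape)   # stand-in for a model gradient
    x_adv = linf_pgd_step(x, x_adv, fake_grad)

# The perturbation never exceeds the budget, no matter how many steps run.
print(float(np.abs(x_adv - x).max()) <= 8/255 + 1e-9)
```

The projection step (`np.clip` onto `[x - eps, x + eps]`) is what makes the perturbation "norm-bounded": after any number of iterations, `max |x_adv - x|` stays within `eps`.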