Title
Improving Resistance to Adversarial Deformations by Regularizing Gradients
Authors
Abstract
Improving the resistance of deep neural networks against adversarial attacks is important for deploying models in real-world applications. However, most defense methods are designed to defend against intensity perturbations and ignore location perturbations, which are equally important for deep model security. In this paper, we focus on adversarial deformations, a typical class of location perturbations, and propose a flow gradient regularization to improve the resistance of models. Theoretically, we prove that regularizing flow gradients yields a tighter bound than regularizing input gradients. Across multiple datasets, architectures, and adversarial deformations, our empirical results indicate that models trained with flow gradient regularization achieve better resistance than models trained with input gradient regularization, by a large margin, and also outperform adversarially trained models. Moreover, compared with training directly on adversarial deformations, our method achieves better results against unseen attacks, and combining the two methods improves resistance further.
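The abstract contrasts the proposed flow gradient regularization with input gradient regularization, the baseline defense it improves on. The paper's flow-gradient formulation is not given in this excerpt, so the sketch below illustrates only the baseline idea: penalizing the norm of the loss gradient with respect to the input during training. It uses a toy logistic-regression model where that gradient has a closed form; the function name, model, and hyperparameter `lam` are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient_penalized_loss(w, x, y, lam=0.1):
    """Cross-entropy loss of a logistic model p = sigmoid(w . x), plus an
    input-gradient regularizer lam * ||d loss / d x||^2.

    For this model the input gradient is available in closed form:
        d loss / d x = (p - y) * w
    In a deep network the same penalty would be computed with autodiff
    (e.g. a double-backward pass). Toy sketch, not the paper's method.
    """
    p = sigmoid(w @ x)
    loss = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    grad_x = (p - y) * w                      # closed-form input gradient
    return loss + lam * np.sum(grad_x ** 2)   # penalize input sensitivity
```

Minimizing this objective flattens the loss surface around each training input, so small input perturbations change the prediction less; the paper's contribution is to apply the analogous penalty to the deformation flow field rather than to pixel intensities.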