Paper Title
Trace-Norm Adversarial Examples
Paper Authors
Paper Abstract
White-box adversarial perturbations are sought via iterative optimization algorithms, most often minimizing an adversarial loss over an $l_p$ neighborhood of the original image, the so-called distortion set. Constraining the adversarial search with different norms results in disparately structured adversarial examples. Here we explore several distortion sets with structure-enhancing algorithms. These new structures for adversarial examples, yet pervasive in optimization, are for instance a challenge for adversarial theoretical certification, which so far provides only $l_p$ certificates. Because adversarial robustness is still an empirical field, defense mechanisms should also reasonably be evaluated against differently structured attacks. Besides, these structured adversarial perturbations may allow for larger distortion sizes than their $l_p$ counterparts while remaining imperceptible, or perceptible as natural slight distortions of the image. Finally, they allow some control over the generation of the adversarial perturbation, e.g. (localized) blurriness.
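The core ingredient of a projected-gradient attack over a structured distortion set is the projection step onto the constraint ball. For the trace-norm (nuclear-norm) set named in the title, this reduces to an SVD of the perturbation followed by a Euclidean projection of the singular values onto an $l_1$ ball, which shrinks small singular values to zero and yields low-rank, hence structured, perturbations. The sketch below is illustrative only, not the authors' implementation; the function names and the choice of NumPy are assumptions.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Project a nonnegative vector v onto the l1 ball of given radius.

    Uses the standard sort-and-threshold scheme; for nonnegative input
    (e.g. singular values) this coincides with simplex projection.
    """
    if v.sum() <= radius:
        return v  # already inside the ball
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u)
    # largest index rho with u[rho] > (css[rho] - radius) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (css - radius))[0][-1]
    theta = (css[rho] - radius) / (rho + 1)   # soft-threshold level
    return np.maximum(v - theta, 0.0)

def project_nuclear_ball(delta, radius):
    """Project a perturbation matrix onto the nuclear-norm ball.

    The nuclear (trace) norm is the l1 norm of the singular values,
    so projecting the singular values onto the l1 ball and
    reassembling the matrix gives the Euclidean projection.
    """
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    s_proj = project_l1_ball(s, radius)
    return (U * s_proj) @ Vt
```

In a projected-gradient loop, this projection would replace the usual clipping onto the $l_\infty$ box after each gradient step; because the $l_1$ projection of the spectrum is sparse, the resulting perturbation is low-rank, which is one way the "structure-enhancing" effect described above can arise.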