Understanding Adversarial Examples from the Mutual Influence of Images and Perturbations

Paper title

Understanding Adversarial Examples from the Mutual Influence of Images and Perturbations

Authors

Zhang, Chaoning; Benz, Philipp; Imtiaz, Tooba; Kweon, In-So

Abstract

A wide variety of works have explored the reason for the existence of adversarial examples, but there is no consensus on the explanation. We propose to treat the DNN logits as a vector for feature representation, and exploit them to analyze the mutual influence of two independent inputs based on the Pearson correlation coefficient (PCC). We utilize this vector representation to understand adversarial examples by disentangling the clean images and adversarial perturbations, and analyze their influence on each other. Our results suggest a new perspective towards the relationship between images and universal perturbations: Universal perturbations contain dominant features, and images behave like noise to them. This feature perspective leads to a new method for generating targeted universal adversarial perturbations using random source images. We are the first to achieve the challenging task of a targeted universal attack without utilizing original training data. Our approach using a proxy dataset achieves comparable performance to the state-of-the-art baselines which utilize the original training dataset.
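
The PCC-based analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes the logit vectors for a clean image, for a perturbation alone, and for their sum are already available; the numbers below are made up for illustration and are not results from the paper, and the helper `pearson` and all variable names are hypothetical. The input whose logits correlate more strongly with the combined logits is read here as the one contributing the dominant features.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("PCC is undefined for a constant vector")
    return cov / (norm_u * norm_v)


# Hypothetical logit vectors over four classes (illustrative values only).
logits_image = [2.0, 0.5, -1.0, 0.1]         # network output for clean image x
logits_perturbation = [-0.5, 0.2, 3.0, 0.0]  # network output for perturbation v alone
logits_combined = [-0.2, 0.4, 2.8, 0.1]      # network output for x + v

# Compare how strongly each input's logits correlate with the combined logits.
pcc_image = pearson(logits_image, logits_combined)
pcc_perturbation = pearson(logits_perturbation, logits_combined)
dominant = "perturbation" if pcc_perturbation > pcc_image else "image"
print(f"PCC(image, combined) = {pcc_image:.3f}")
print(f"PCC(perturbation, combined) = {pcc_perturbation:.3f}")
print(f"dominant input: {dominant}")
```

In practice the three logit vectors would come from forward passes of the same network, and the comparison would be repeated over many images rather than a single hand-written example.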
