Paper Title

Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations

Paper Authors

Hashmat Shadab Malik, Shahina K Kunhimon, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Paper Abstract

Transferable adversarial attacks optimize adversaries from a pretrained surrogate model and known label space to fool unknown black-box models. Therefore, these attacks are restricted by the availability of an effective surrogate model. In this work, we relax this assumption and propose Adversarial Pixel Restoration as a self-supervised alternative to train an effective surrogate model from scratch under the condition of no labels and few data samples. Our training approach is based on a min-max scheme which reduces overfitting via an adversarial objective and thus optimizes for a more generalizable surrogate model. Our proposed attack is complementary to the adversarial pixel restoration and is independent of any task-specific objective, as it can be launched in a self-supervised manner. We successfully demonstrate the adversarial transferability of our approach to Vision Transformers as well as Convolutional Neural Networks for the tasks of classification, object detection, and video segmentation. Our training approach improves the transferability of the baseline unsupervised training method by 16.4% on the ImageNet validation set. Our code and pre-trained surrogate models are available at: https://github.com/HashmatShadab/APR
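To make the min-max scheme described in the abstract concrete, below is a minimal illustrative sketch, not the authors' implementation (see the linked repository for that). The surrogate `TinyAutoencoder`, the patch-masking `corrupt` transform, and all hyperparameters are assumptions chosen for brevity: the inner loop maximizes a pixel-restoration loss over a bounded perturbation, and the outer step trains the surrogate to restore clean pixels from that adversarially perturbed, corrupted input.

```python
# Illustrative sketch of min-max Adversarial Pixel Restoration training.
# Not the authors' exact code; architecture, corruption, and hyperparameters
# are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyAutoencoder(nn.Module):
    """Minimal encoder-decoder standing in for the surrogate model."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def corrupt(x, mask_ratio=0.5, patch=8):
    """Mask random patches so the surrogate must restore missing pixels."""
    b, _, h, w = x.shape
    mask = (torch.rand(b, 1, h // patch, w // patch, device=x.device) > mask_ratio).float()
    mask = F.interpolate(mask, size=(h, w), mode="nearest")
    return x * mask


def train_step(surrogate, optimizer, images, eps=8 / 255, alpha=2 / 255, pgd_steps=5):
    """One min-max step: maximize the restoration loss over a bounded
    perturbation (inner PGD), then minimize it over the surrogate weights."""
    corrupted = corrupt(images)

    # Inner maximization: PGD on the pixel-restoration loss.
    delta = torch.zeros_like(corrupted).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(pgd_steps):
        adv = (corrupted + delta).clamp(0, 1)
        loss = F.l1_loss(surrogate(adv), images)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)

    # Outer minimization: restore clean pixels from the adversarial input.
    optimizer.zero_grad()
    adv = (corrupted + delta.detach()).clamp(0, 1)
    loss = F.l1_loss(surrogate(adv), images)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    surrogate = TinyAutoencoder()
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    batch = torch.rand(4, 3, 64, 64)  # stand-in for a few unlabeled images
    print(train_step(surrogate, opt, batch))
```

Because the restoration objective never uses labels, the same self-supervised loss can later drive the attack itself: perturbations crafted to disrupt the surrogate's pixel restoration are then transferred to black-box classifiers, detectors, or segmentation models.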
