Paper Title

Patch-wise Attack for Fooling Deep Neural Network

Paper Authors

Lianli Gao, Qilong Zhang, Jingkuan Song, Xianglong Liu, Heng Tao Shen

Paper Abstract

By adding human-imperceptible noise to clean images, the resultant adversarial examples can fool other unknown models. Features of a pixel extracted by deep neural networks (DNNs) are influenced by its surrounding regions, and different DNNs generally focus on different discriminative regions in recognition. Motivated by this, we propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models, which differs from the existing attack methods manipulating pixel-wise noise. In this way, without sacrificing the performance of white-box attack, our adversarial examples can have strong transferability. Specifically, we introduce an amplification factor to the step size in each iteration, and one pixel's overall gradient overflowing the $\epsilon$-constraint is properly assigned to its surrounding regions by a project kernel. Our method can be generally integrated into any gradient-based attack method. Compared with the current state-of-the-art attacks, we significantly improve the success rate by 9.2\% for defense models and 3.7\% for normally trained models on average. Our code is available at \url{https://github.com/qilong-zhang/Patch-wise-iterative-attack}
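To make the described mechanism concrete, below is a minimal PyTorch sketch of a patch-wise iterative attack in the spirit of the abstract: the step size is amplified by a factor `beta`, and the portion of the accumulated noise that overflows the $\epsilon$-constraint is redistributed to neighbouring pixels through a uniform project kernel. This is not the authors' official implementation (see the repository linked above); the names `pi_fgsm` and `project_kernel`, the kernel size, the values of `eps`, `beta`, and `num_iter`, the choice `gamma = beta * alpha`, and the assumption that inputs lie in [0, 1] are all illustrative.

```python
import torch
import torch.nn.functional as F


def project_kernel(kernel_size=3, channels=3):
    """Uniform project kernel: spreads a pixel's overflowed noise evenly over
    its neighbours (center weight is zero). Kernel size is an assumption."""
    k = torch.ones(kernel_size, kernel_size) / (kernel_size ** 2 - 1)
    k[kernel_size // 2, kernel_size // 2] = 0.0
    # one depth-wise filter per image channel, shape (C, 1, k, k)
    return k.expand(channels, 1, kernel_size, kernel_size).clone()


def pi_fgsm(model, x, y, eps=16 / 255, num_iter=10, beta=10.0, kernel_size=3):
    """Hypothetical sketch of a patch-wise iterative attack: amplified step
    size, with noise overflowing the eps-ball projected onto surrounding
    pixels by the project kernel. Expects inputs x in [0, 1]."""
    alpha = eps / num_iter            # basic step size
    gamma = beta * alpha              # projection factor (assumed, for illustration)
    W_p = project_kernel(kernel_size, x.size(1)).to(x.device)

    x_adv = x.clone().detach()
    a = torch.zeros_like(x)           # accumulated amplified noise

    for _ in range(num_iter):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]

        # amplified gradient step accumulated over iterations
        a = a + beta * alpha * grad.sign()
        # part of the accumulated noise that overflows the eps-constraint
        overflow = torch.clamp(a.abs() - eps, min=0.0) * a.sign()
        # redistribute the overflow to surrounding pixels via the project kernel
        projection = gamma * torch.sign(
            F.conv2d(overflow, W_p, padding=kernel_size // 2, groups=x.size(1))
        )
        a = a + projection

        x_adv = x_adv.detach() + beta * alpha * grad.sign() + projection
        # keep the result inside the eps-ball and the valid pixel range
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)

    return x_adv.detach()
```

Under these assumptions, usage is a single call such as `x_adv = pi_fgsm(model, images, labels)` on a batch of normalized images; the returned tensor can then be fed to other (black-box) models to probe transferability.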
