Paper Title

An Empirical Review of Adversarial Defenses

Authors

Goel, Ayush

Abstract

From face recognition systems installed in phones to self-driving cars, the field of AI is witnessing rapid transformation and is being integrated into our everyday lives at an incredible pace. Any major failure in these systems' predictions could be devastating, leaking sensitive information or even costing lives (as in the case of self-driving cars). However, deep neural networks, which form the basis of such systems, are highly susceptible to a specific class of attacks called adversarial attacks. With minimal computation, an attacker can generate adversarial examples (images or data points that belong to another class but consistently fool the model into misclassifying them as genuine) and thereby undermine the foundations of such algorithms. In this paper, we compile and test numerous approaches to defend against such adversarial attacks. Of the ones explored, we found two effective techniques, namely Dropout and Denoising Autoencoders, and show their success in preventing such attacks from fooling the model. We demonstrate that these techniques are also resistant to higher noise levels as well as to different kinds of adversarial attacks (although not tested against all). We also develop a framework for deciding which defense technique to use against an attack, based on the nature of the application and the resource constraints of the deep neural network.
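The abstract names an attack family (adversarial examples crafted with minimal computation) and two defenses (Dropout and Denoising Autoencoders) without specifying datasets, architectures, or attack algorithms. As a rough, hypothetical illustration only, the PyTorch sketch below crafts FGSM adversarial examples and routes them through a small denoising autoencoder placed in front of a dropout-regularized classifier; the 28x28 grayscale input shape, layer sizes, and epsilon are assumptions made for this sketch, not details from the paper.

```python
# A minimal, hypothetical sketch (not the paper's implementation): FGSM
# adversarial examples plus a denoising-autoencoder preprocessing defense
# in front of a dropout-regularized classifier.
import torch
import torch.nn as nn


class DenoisingAutoencoder(nn.Module):
    """Small convolutional autoencoder that maps perturbed inputs back
    toward the clean data manifold (architecture chosen for illustration)."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def fgsm_attack(model, images, labels, epsilon=0.1):
    """Fast Gradient Sign Method: one gradient step that increases the
    classification loss, clipped to the valid pixel range [0, 1]."""
    images = images.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0, 1).detach()


def defended_predict(classifier, autoencoder, images):
    """Denoise first, then classify: the autoencoder acts as a
    preprocessing filter in front of the classifier."""
    with torch.no_grad():
        return classifier(autoencoder(images)).argmax(dim=1)


if __name__ == "__main__":
    # Dropout stands in for the second defense the abstract mentions;
    # the training procedure and dataset are not specified there.
    classifier = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 256), nn.ReLU(),
        nn.Dropout(p=0.5),
        nn.Linear(256, 10),
    )
    autoencoder = DenoisingAutoencoder()

    x = torch.rand(8, 1, 28, 28)       # stand-in batch of grayscale images
    y = torch.randint(0, 10, (8,))     # stand-in labels
    x_adv = fgsm_attack(classifier, x, y, epsilon=0.1)
    print(defended_predict(classifier.eval(), autoencoder.eval(), x_adv))
```

In a real setup the autoencoder would first be trained to reconstruct clean images from noisy ones and the classifier would be trained with dropout enabled; both training loops are omitted here for brevity.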
