Paper Title

Resisting Deep Learning Models Against Adversarial Attack Transferability via Feature Randomization

Authors

Ehsan Nowroozi, Mohammadreza Mohammadi, Pargol Golmohammadi, Yassine Mekdad, Mauro Conti, Selcuk Uluagac

Abstract

In the past decades, the rise of artificial intelligence has given us the capability to solve the most challenging problems in our day-to-day lives, such as cancer prediction and autonomous navigation. However, these applications might not be reliable if they are not secured against adversarial attacks. In addition, recent works have demonstrated that some adversarial examples are transferable across different models. Therefore, it is crucial to avoid such transferability via robust models that resist adversarial manipulations. In this paper, we propose a feature randomization-based approach that resists eight adversarial attacks targeting deep learning models in the testing phase. Our novel approach consists of changing the training strategy in the target network classifier and selecting random feature samples. We consider an attacker under Limited-Knowledge and Semi-Knowledge conditions to undertake the most prevalent types of adversarial attacks. We evaluate the robustness of our approach using the well-known UNSW-NB15 dataset, which includes realistic and synthetic attacks. Afterward, we demonstrate that our strategy outperforms the existing state-of-the-art approach, the Most Powerful Attack, which consists of fine-tuning the network model against specific adversarial attacks. Finally, our experimental results show that our methodology can secure the target network and resist adversarial attack transferability by over 60%.
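For readers who want a concrete picture of what "selecting random feature samples" could look like in practice, below is a minimal, hypothetical sketch in Python with scikit-learn. It assumes feature randomization means training the target classifier on a randomly chosen subset of input features (e.g., UNSW-NB15 flow features) and reusing the same subset at test time; the function name, classifier choice, and parameters are illustrative and not taken from the paper, whose abstract does not specify the exact selection procedure or architecture.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_with_feature_randomization(X, y, n_selected, seed=None):
    """Hypothetical sketch: train the target classifier on a random feature subset.

    X          : (n_samples, n_features) training matrix, e.g. UNSW-NB15 flow features.
    y          : class labels (benign / attack).
    n_selected : number of features to keep at random.

    Returns the fitted classifier and the chosen feature indices. The same
    indices must be reused at inference so that adversarial examples crafted
    against a surrogate model, which sees the full feature space, do not align
    with the target model's randomized input space.
    """
    rng = np.random.default_rng(seed)
    feature_idx = rng.choice(X.shape[1], size=n_selected, replace=False)
    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=seed)
    clf.fit(X[:, feature_idx], y)
    return clf, feature_idx

# Usage (illustrative): project incoming samples onto the same random subset.
# clf, idx = train_with_feature_randomization(X_train, y_train, n_selected=30, seed=42)
# y_pred = clf.predict(X_test[:, idx])
```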
