Paper Title
Neural Architectural Backdoors
Paper Authors
Paper Abstract
This paper asks the intriguing question: is it possible to exploit neural architecture search (NAS) as a new attack vector to launch previously improbable attacks? Specifically, we present EVAS, a new attack that leverages NAS to find neural architectures with inherent backdoors and exploits such vulnerability using input-aware triggers. Compared with existing attacks, EVAS demonstrates many interesting properties: (i) it does not require polluting training data or perturbing model parameters; (ii) it is agnostic to downstream fine-tuning or even re-training from scratch; (iii) it naturally evades defenses that rely on inspecting model parameters or training data. With extensive evaluation on benchmark datasets, we show that EVAS features high evasiveness, transferability, and robustness, thereby expanding the adversary's design spectrum. We further characterize the mechanisms underlying EVAS, which are possibly explainable by architecture-level "shortcuts" that recognize trigger patterns. This work raises concerns about the current practice of NAS and points to potential directions to develop effective countermeasures.
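To make the two components named in the abstract concrete, below is a minimal, hypothetical sketch of the overall recipe: a fixed architecture (standing in for one returned by the NAS step) paired with an input-aware trigger generator that produces a per-input perturbation. The names `nas_found_backbone`, `TriggerGenerator`, and the chosen layer shapes and trigger budget are illustrative assumptions, not the authors' actual search space, architecture, or trigger design.

```python
# Hypothetical sketch (not the paper's code): a fixed "NAS-found" backbone plus
# an input-aware trigger generator, mirroring the high-level setup in the abstract.
import torch
import torch.nn as nn


def nas_found_backbone(num_classes: int = 10) -> nn.Module:
    # Stand-in for an architecture returned by the NAS step; the actual
    # searched topology is not specified in the abstract.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, num_classes),
    )


class TriggerGenerator(nn.Module):
    """Produces a small, input-dependent perturbation (an "input-aware trigger")."""

    def __init__(self, budget: float = 8 / 255):
        super().__init__()
        self.budget = budget  # assumed perturbation bound, for illustration only
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Bound the trigger magnitude so the triggered input stays close to x.
        return torch.clamp(x + self.budget * self.net(x), 0.0, 1.0)


if __name__ == "__main__":
    model = nas_found_backbone()      # architecture hypothesized to carry the inherent backdoor
    generator = TriggerGenerator()
    x = torch.rand(4, 3, 32, 32)      # a batch of clean inputs
    x_trig = generator(x)             # the same inputs carrying input-aware triggers
    clean_logits, trig_logits = model(x), model(x_trig)
    print(clean_logits.shape, trig_logits.shape)
```

Note that, consistent with property (i) in the abstract, nothing in this sketch touches training data or model parameters; the attack surface is the architecture itself, with the generator supplying triggers at inference time.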