Simulating Unknown Target Models for Query-Efficient Black-box Attacks

Paper Title

Simulating Unknown Target Models for Query-Efficient Black-box Attacks

Authors

Chen Ma, Li Chen, Jun-Hai Yong

Abstract

Many adversarial attacks have been proposed to investigate the security issues of deep neural networks. In the black-box setting, current model stealing attacks train a substitute model to counterfeit the functionality of the target model. However, this training requires querying the target model, so the query complexity remains high and such attacks can be defended against easily. This study aims to train a generalized substitute model called "Simulator", which can mimic the functionality of any unknown target model. To this end, we build the training data in the form of multiple tasks by collecting query sequences generated during attacks on various existing networks. The learning process uses a mean squared error (MSE)-based knowledge-distillation loss in meta-learning to minimize the difference between the Simulator and the sampled networks. The meta-gradients of this loss are then computed and accumulated over multiple tasks to update the Simulator and thereby improve generalization. When attacking a target model that is unseen in training, the trained Simulator can accurately simulate its functionality using its limited feedback. As a result, a large fraction of queries can be transferred to the Simulator, thereby reducing query complexity. Results of comprehensive experiments conducted on the CIFAR-10, CIFAR-100, and TinyImageNet datasets demonstrate that the proposed approach reduces query complexity by several orders of magnitude compared to the baseline method. The implementation source code is released at https://github.com/machanic/SimulatorAttack.
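
The meta-training idea in the abstract (an MSE distillation loss per task, with meta-gradients accumulated over tasks) can be illustrated with a small sketch. This is not the authors' implementation: it assumes linear models as stand-ins for the Simulator and the sampled networks, random vectors in place of real query sequences, and a first-order approximation of the meta-gradient. All function names (`adapt`, `meta_step`, `make_tasks`, `train_demo`) and hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def predict(w, x):
    """Linear stand-in for a network: dot product of weights and input."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mse_and_grad(w, xs, targets):
    """MSE distillation loss of predict(w, .) against targets, and its gradient."""
    n = len(xs)
    loss, grad = 0.0, [0.0] * len(w)
    for x, t in zip(xs, targets):
        err = predict(w, x) - t
        loss += err * err / n
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / n
    return loss, grad


def adapt(w, teacher, support, lr=0.05, steps=3):
    """Fine-tune a copy of the simulator on a few teacher outputs (inner loop)."""
    fast = list(w)
    targets = [predict(teacher, x) for x in support]
    for _ in range(steps):
        _, g = mse_and_grad(fast, support, targets)
        fast = [f - lr * gi for f, gi in zip(fast, g)]
    return fast


def query_loss(w, tasks):
    """Average post-adaptation MSE on each task's query set."""
    total = 0.0
    for teacher, support, query in tasks:
        fast = adapt(w, teacher, support)
        loss, _ = mse_and_grad(fast, query, [predict(teacher, x) for x in query])
        total += loss / len(tasks)
    return total


def meta_step(w, tasks, outer_lr=0.05):
    """Accumulate query-set gradients over tasks, then update the simulator.

    Assumption: first-order approximation (the gradient is taken at the
    adapted weights and applied directly to the initial weights).
    """
    meta_grad = [0.0] * len(w)
    for teacher, support, query in tasks:
        fast = adapt(w, teacher, support)
        _, g = mse_and_grad(fast, query, [predict(teacher, x) for x in query])
        meta_grad = [m + gi / len(tasks) for m, gi in zip(meta_grad, g)]
    return [wi - outer_lr * m for wi, m in zip(w, meta_grad)]


def make_tasks(rng, dim=4, n_tasks=5, n_points=8):
    """Each task: a random linear 'teacher' plus support and query inputs."""
    tasks = []
    for _ in range(n_tasks):
        teacher = [1.0 + rng.uniform(-0.2, 0.2) for _ in range(dim)]
        support = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_points)]
        query = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_points)]
        tasks.append((teacher, support, query))
    return tasks


def train_demo(seed=0, meta_steps=200):
    """Meta-train from zero weights; return held-out loss before and after."""
    rng = random.Random(seed)
    held_out = make_tasks(rng)  # teachers never used for meta-training
    w = [0.0] * 4
    before = query_loss(w, held_out)
    for _ in range(meta_steps):
        w = meta_step(w, make_tasks(rng))
    return before, query_loss(w, held_out)


if __name__ == "__main__":
    before, after = train_demo()
    print(f"held-out distillation loss: {before:.4f} -> {after:.4f}")
```

In this toy setting the held-out teachers play the role of the unseen target model: the meta-trained weights only need a few support outputs to fit them closely, which mirrors the claim that limited feedback suffices at attack time.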
