Paper Title

How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

Paper Authors

Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jinfeng Yi, Mingyi Hong, Shiyu Chang, Sijia Liu

Abstract

The lack of adversarial robustness has been recognized as an important issue for state-of-the-art machine learning (ML) models, e.g., deep neural networks (DNNs). Thereby, robustifying ML models against adversarial attacks is now a major focus of research. However, nearly all existing defense methods, particularly those based on robust training, make the white-box assumption that the defender has access to the details of an ML model (or its surrogate alternatives, if available), e.g., its architecture and parameters. Going beyond existing works, in this paper we aim to address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback? Such a problem arises in practical scenarios where the owner of the predictive model is reluctant to share model information in order to preserve privacy. To this end, we propose a general notion of a defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS), a first-order (FO) certified defense technique. To allow a design that uses only model queries, we further integrate DS with zeroth-order (gradient-free) optimization. However, a direct implementation of zeroth-order (ZO) optimization suffers from a high variance of gradient estimates, which leads to ineffective defense. To tackle this problem, we next propose to prepend an autoencoder (AE) to a given (black-box) model so that DS can be trained using variance-reduced ZO optimization. We term the eventual defense ZO-AE-DS. In practice, we empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines. The effectiveness of our approach is justified under both image classification and image reconstruction tasks. Code is available at https://github.com/damon-demon/Black-Box-Defense.
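The key ingredient that makes the defense query-only is ZO optimization, which estimates gradients from function evaluations alone. As a minimal illustration (not the paper's variance-reduced variant, and with hypothetical function and parameter names), a standard two-point randomized gradient estimator for a black-box loss can be sketched as:

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=0.005, n_queries=100, rng=None):
    """Two-point randomized zeroth-order gradient estimator.

    Approximates grad f(x) using only input queries and output feedback:
        g ~ (1/q) * sum_i  d * (f(x + mu*u_i) - f(x)) / mu * u_i
    where the u_i are random unit-norm directions, d = dim(x), and
    mu is the smoothing (finite-difference) radius.
    """
    rng = np.random.default_rng(rng)
    d = x.size
    fx = f(x)  # one baseline query, reused across all directions
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_queries):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # sample a uniform direction on the unit sphere
        g += (d * (f(x + mu * u) - fx) / mu) * u
    return g / n_queries
```

The estimator's variance grows with the dimension of `x`, which is why a naive application to high-dimensional image inputs is ineffective; the paper's AE front end reduces the effective dimension over which ZO gradients must be estimated.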
