Paper Title
Denoised Smoothing: A Provable Defense for Pretrained Classifiers
Paper Authors
Paper Abstract
We present a method for provably defending any pretrained image classifier against $\ell_p$ adversarial attacks. This method, for instance, allows public vision API providers and users to seamlessly convert pretrained non-robust classification services into provably robust ones. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be $\ell_p$-robust to adversarial examples, without modifying the pretrained classifier. Our approach applies to both the white-box and the black-box settings of the pretrained classifier. We refer to this defense as denoised smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our approach to provably defend the Azure, Google, AWS, and Clarifai image classification APIs. Our code replicating all the experiments in the paper can be found at: https://github.com/microsoft/denoised-smoothing.
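To make the pipeline in the abstract concrete, the following is a minimal sketch of the prediction step of denoised smoothing, not the authors' released implementation. It assumes PyTorch and two hypothetical `torch.nn.Module`s: `classifier`, the frozen pretrained model, and `denoiser`, the custom-trained denoiser prepended to it. The smoothed prediction is the majority class of `classifier(denoiser(x + noise))` over Gaussian noise samples; the certified-radius computation of randomized smoothing (the statistical test of Cohen et al.) is omitted here.

```python
import torch

def smoothed_predict(classifier, denoiser, x, sigma=0.25, n=100):
    """Monte Carlo prediction of the denoised-smoothing classifier.

    x: a single image tensor of shape (C, H, W).
    sigma: standard deviation of the isotropic Gaussian noise.
    n: number of noise samples used to estimate the majority class.
    """
    with torch.no_grad():
        # Replicate the input n times and perturb with Gaussian noise.
        batch = x.unsqueeze(0).repeat(n, 1, 1, 1)            # (n, C, H, W)
        noisy = batch + sigma * torch.randn_like(batch)
        # Denoise first, then classify with the unmodified pretrained model.
        preds = classifier(denoiser(noisy)).argmax(dim=1)
        counts = torch.bincount(preds)
    return counts.argmax().item()  # majority-vote class
```

Note that neither `classifier` nor `denoiser` is retrained or modified at prediction time, which is why the same scheme applies to black-box APIs: the API call simply plays the role of `classifier`.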