Paper Title
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks
Paper Authors
Paper Abstract
Data poisoning attacks and backdoor attacks aim to corrupt a machine learning classifier by modifying, adding, and/or removing some carefully selected training examples, such that the corrupted classifier makes incorrect predictions as the attacker desires. The key idea of state-of-the-art certified defenses against data poisoning attacks and backdoor attacks is to create a majority vote mechanism to predict the label of a test example, where each voter is a base classifier trained on a subset of the training dataset. Classical simple learning algorithms such as k nearest neighbors (kNN) and radius nearest neighbors (rNN) have intrinsic majority vote mechanisms. In this work, we show that the intrinsic majority vote mechanisms in kNN and rNN already provide certified robustness guarantees against data poisoning attacks and backdoor attacks. Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses. Our results serve as standard baselines for future certified defenses against data poisoning attacks and backdoor attacks.
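The intuition behind the kNN guarantee can be illustrated with a small sketch. The Python snippet below is illustrative only and is not the paper's implementation: it makes a plain kNN majority-vote prediction and derives a simple certified poisoning size from the gap between the top two vote counts, under the simplifying assumption that each modified, added, or removed training example can change at most one of the k nearest neighbors and that ties are broken in the attacker's favor. The function name `knn_predict_with_certificate` is a hypothetical helper, and the exact bound and tie-breaking rule in the paper may differ.

```python
import numpy as np

def knn_predict_with_certificate(X_train, y_train, x_test, k):
    """Plain kNN majority vote plus a simple certified poisoning size.

    Sketch only: assumes each poisoned (modified/added/removed) training
    example can change at most one of the k nearest neighbors, and breaks
    ties pessimistically in favor of the attacker. The paper's exact bound
    and tie-breaking rule may differ.
    """
    # Euclidean distances from the test example to every training example.
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(dists)[:k]

    # Count votes among the k nearest neighbors.
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    order = np.argsort(-counts)
    top_label = labels[order[0]]
    top_count = counts[order[0]]
    runner_up = counts[order[1]] if len(counts) > 1 else 0

    # Each poisoned example can, in the worst case, move one vote from the
    # predicted label to the runner-up label. The prediction is stable as
    # long as top_count - e > runner_up + e, i.e.
    # e <= floor((top_count - runner_up - 1) / 2).
    certified_size = max((top_count - runner_up - 1) // 2, 0)
    return top_label, certified_size
```

Under these assumptions, calling `knn_predict_with_certificate` on, say, flattened MNIST images would return both the kNN prediction for a test example and the number of training examples an attacker could modify, add, or remove without changing that prediction.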