Paper Title
Certifying Confidence via Randomized Smoothing
Paper Authors
Paper Abstract
Randomized smoothing has been shown to provide good certified-robustness guarantees for high-dimensional classification problems. It uses the probabilities of predicting the top two most-likely classes around an input point under a smoothing distribution to generate a certified radius for a classifier's prediction. However, most smoothing methods do not give us any information about the confidence with which the underlying classifier (e.g., deep neural network) makes a prediction. In this work, we propose a method to generate certified radii for the prediction confidence of the smoothed classifier. We consider two notions for quantifying confidence: average prediction score of a class and the margin by which the average prediction score of one class exceeds that of another. We modify the Neyman-Pearson lemma (a key theorem in randomized smoothing) to design a procedure for computing the certified radius where the confidence is guaranteed to stay above a certain threshold. Our experimental results on CIFAR-10 and ImageNet datasets show that using information about the distribution of the confidence scores allows us to achieve a significantly better certified radius than ignoring it. Thus, we demonstrate that extra information about the base classifier at the input point can help improve certified guarantees for the smoothed classifier. Code for the experiments is available at https://github.com/aounon/cdf-smoothing.
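For intuition, here is a minimal, hypothetical sketch (not the authors' code from the linked repository) of the two quantities the abstract contrasts: a Monte Carlo estimate of a class's average prediction score under a smoothing distribution, and the standard certified radius that uses only the top-two class probabilities. The Gaussian smoothing distribution, the function names, and the use of NumPy/SciPy are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm

def smoothed_scores(base_classifier, x, sigma=0.25, n_samples=1000):
    """Monte Carlo estimate of the average prediction scores of the base
    classifier under Gaussian smoothing N(x, sigma^2 I).

    base_classifier: callable mapping a 1-D input vector to a vector of
    per-class prediction scores (e.g., softmax outputs of a neural network).
    Returns the estimated average score for each class.
    """
    noisy = x[None, :] + sigma * np.random.randn(n_samples, x.shape[0])
    scores = np.stack([base_classifier(z) for z in noisy])  # (n_samples, num_classes)
    return scores.mean(axis=0)

def score_margin(avg_scores, class_a, class_b):
    """Margin by which the average score of class_a exceeds that of class_b."""
    return avg_scores[class_a] - avg_scores[class_b]

def standard_certified_radius(p_top, p_runner_up, sigma=0.25):
    """Standard smoothing certificate (Cohen et al. style) computed from the
    probabilities of the top two classes only; shown here for contrast with
    the confidence-based certificates described in the abstract."""
    return 0.5 * sigma * (norm.ppf(p_top) - norm.ppf(p_runner_up))
```

The paper's contribution goes beyond this sketch: rather than using only the top-two class probabilities, it exploits the distribution of the confidence scores (via a modified Neyman-Pearson argument) to certify that the smoothed classifier's confidence stays above a threshold within the certified radius.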