Paper title
A law of adversarial risk, interpolation, and label noise
Paper authors
Paper abstract
In supervised learning, it has been shown that label noise in the data can be interpolated without penalty to test accuracy. We show that interpolating label noise induces adversarial vulnerability, and prove the first theorem relating label noise to adversarial risk for any data distribution. Our results are almost tight if we make no assumptions on the inductive bias of the learning algorithm. We then investigate how different components of this problem affect this result, including properties of the distribution. We also discuss non-uniform label noise distributions, and prove a new theorem showing that uniform label noise induces nearly as large an adversarial risk as the worst poisoning at the same noise rate. We then provide theoretical and empirical evidence that uniform label noise is more harmful than typical real-world label noise. Finally, we show how inductive biases amplify the effect of label noise and argue that future work is needed in this direction.
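The core claim — that a classifier which interpolates (fits exactly) a training set containing uniform label noise becomes adversarially vulnerable near the mislabelled points — can be illustrated with a toy sketch. This is not the paper's construction, just a minimal 1-D example using a 1-nearest-neighbour interpolator and an assumed noise rate `eta`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data: the true label is 1 iff x > 0. Flip each training
# label independently with probability eta (uniform label noise).
n, eta = 200, 0.1
X = rng.uniform(-1, 1, size=n)
y_true = (X > 0).astype(int)
flip = rng.random(n) < eta
y_noisy = np.where(flip, 1 - y_true, y_true)

def predict_1nn(x):
    # 1-nearest-neighbour classifier: it interpolates the noisy
    # training set, i.e. achieves zero training error on (X, y_noisy).
    return y_noisy[np.argmin(np.abs(X - x))]

# Every flipped training point creates an adversarial region: a clean
# test point perturbed onto (or near) it receives the wrong label.
vulnerable = sum(predict_1nn(X[i]) != y_true[i] for i in range(n) if flip[i])
print(vulnerable, int(flip.sum()))
```

Because the interpolator reproduces every flipped label exactly, `vulnerable` equals the number of flipped points, so roughly an `eta` fraction of the input space sits within a small perturbation of a misclassified region, matching the qualitative statement of the abstract.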