Paper Title
Self-Adaptive Training: beyond Empirical Risk Minimization
Paper Authors
Paper Abstract
We propose self-adaptive training, a new training algorithm that dynamically corrects problematic training labels using the model's own predictions without incurring extra computational cost, to improve the generalization of deep learning on potentially corrupted training data. This problem is crucial for robust learning from data corrupted by, e.g., label noise and out-of-distribution samples. Standard empirical risk minimization (ERM) on such data, however, easily overfits the noise and thus suffers from sub-optimal performance. In this paper, we observe that model predictions can substantially benefit the training process: self-adaptive training significantly improves generalization over ERM under various levels of noise, and mitigates the overfitting issue in both natural and adversarial training. We evaluate the error-capacity curve of self-adaptive training: the test error decreases monotonically with model capacity. This is in sharp contrast to the recently discovered double-descent phenomenon in ERM, which may be a result of overfitting to noise. Experiments on the CIFAR and ImageNet datasets verify the effectiveness of our approach in two applications: classification with label noise and selective classification. We release our code at https://github.com/LayneH/self-adaptive-training.
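To make the label-correction idea described in the abstract concrete, the Python sketch below keeps a soft target for every training example and blends it with the model's predictions via a moving average. This is an illustrative sketch only, not the authors' released implementation: the momentum alpha, the warm-up start_epoch, and the class and method names are assumptions made for exposition.

# Minimal sketch of prediction-based label correction
# (illustrative; alpha, start_epoch and the loss form are assumptions,
# not the paper's exact settings).
import torch
import torch.nn.functional as F

class SelfAdaptiveTargets:
    def __init__(self, labels, num_classes, alpha=0.9, start_epoch=60):
        # Start from the (possibly noisy) one-hot training labels.
        self.targets = F.one_hot(labels, num_classes).float()
        self.alpha = alpha              # assumed momentum of the moving average
        self.start_epoch = start_epoch  # assumed warm-up before correction begins

    def update(self, indices, logits, epoch):
        # After warm-up, blend the stored targets with the model's current predictions.
        if epoch >= self.start_epoch:
            probs = torch.softmax(logits.detach(), dim=1)
            self.targets[indices] = (
                self.alpha * self.targets[indices] + (1 - self.alpha) * probs
            )
        return self.targets[indices]

    def loss(self, logits, indices, epoch):
        # Cross-entropy against the (soft) corrected targets.
        targets = self.update(indices, logits, epoch)
        log_probs = F.log_softmax(logits, dim=1)
        return -(targets * log_probs).sum(dim=1).mean()

In a training loop, the batch loss would be computed as, e.g., sat.loss(model(x), batch_indices, epoch); the exact schedule, momentum value, and loss weighting in the official repository may differ.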