论文标题
从嘈杂的标签中学习,以粗到精细的样本可信度建模
Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling
论文作者
论文摘要
训练带有嘈杂标签的深神经网络(DNN)实际上是具有挑战性的,因为不准确的标签严重降低了DNN的概括能力。以前的努力倾向于通过识别带有粗糙的小损失标准的嘈杂数据来处理统一的denoising流中的零件或完整数据,以减轻嘈杂标签的干扰,而忽略了嘈杂样本的困难是不同的,因此刚性和统一的数据选择管道无法很好地解决这个问题。在本文中,我们首先提出了一种称为CREMA的粗到最新的鲁棒学习方法,以分裂和互动的方式处理嘈杂的数据。在粗糙的水平,干净和嘈杂的集合首先从统计意义上就可信度分开。由于实际上不可能正确对所有嘈杂样本进行分类,因此我们通过对每个样本的可信度进行建模来进一步处理它们。具体而言,对于清洁集,我们故意设计了一种基于内存的调制方案,以动态调整每个样本在训练过程中的历史可信度顺序方面的贡献,从而减轻了错误分组到清洁集中的噪声样本的效果。同时,对于分类为嘈杂集的样本,提出了选择性标签更新策略,以纠正嘈杂的标签,同时减轻校正错误的问题。广泛的实验是在不同方式的基准上进行的,包括图像分类(CIFAR,Clothing1M等)和文本识别(IMDB),具有合成或自然语义噪声,表明CREMA的优势和普遍性。
Training deep neural network (DNN) with noisy labels is practically challenging since inaccurate labels severely degrade the generalization ability of DNN. Previous efforts tend to handle part or full data in a unified denoising flow via identifying noisy data with a coarse small-loss criterion to mitigate the interference from noisy labels, ignoring the fact that the difficulties of noisy samples are different, thus a rigid and unified data selection pipeline cannot tackle this problem well. In this paper, we first propose a coarse-to-fine robust learning method called CREMA, to handle noisy data in a divide-and-conquer manner. In coarse-level, clean and noisy sets are firstly separated in terms of credibility in a statistical sense. Since it is practically impossible to categorize all noisy samples correctly, we further process them in a fine-grained manner via modeling the credibility of each sample. Specifically, for the clean set, we deliberately design a memory-based modulation scheme to dynamically adjust the contribution of each sample in terms of its historical credibility sequence during training, thus alleviating the effect from noisy samples incorrectly grouped into the clean set. Meanwhile, for samples categorized into the noisy set, a selective label update strategy is proposed to correct noisy labels while mitigating the problem of correction error. Extensive experiments are conducted on benchmarks of different modalities, including image classification (CIFAR, Clothing1M etc) and text recognition (IMDB), with either synthetic or natural semantic noises, demonstrating the superiority and generality of CREMA.