Title
ProSelfLC: Progressive Self Label Correction Towards A Low-Temperature Entropy State
Authors
Abstract
There is a family of label modification approaches including self and non-self label correction (LC), and output regularisation. They are widely used for training robust deep neural networks (DNNs), but have not been mathematically and thoroughly analysed together. We study them and discover three key issues: (1) We are more interested in adopting Self LC as it leverages a model's own knowledge and requires no auxiliary models. However, it is unclear how to adaptively trust a learner as the training proceeds. (2) Some methods penalise while others reward low-entropy (i.e., high-confidence) predictions, prompting us to ask which is better. (3) Under the standard training setting, a learned model becomes less confident when severe noise exists. Self LC using such high-entropy knowledge would generate high-entropy targets. To resolve issue (1), inspired by a well-accepted finding, i.e., that deep neural networks learn meaningful patterns before fitting noise, we propose a novel end-to-end method named ProSelfLC, which is designed according to the learning time and the prediction entropy. Concretely, for any data point, we progressively and adaptively trust its predicted probability distribution over its annotated one if the network has been trained for a relatively long time and the prediction is of low entropy. For issue (2), the effectiveness of ProSelfLC defends entropy minimisation. By ProSelfLC, we empirically show that it is more effective to redefine a semantic low-entropy state and optimise the learner toward it. To address issue (3), we decrease the entropy of the self knowledge using a low temperature before exploiting it to correct labels, so that the revised labels redefine low-entropy target probability distributions. We demonstrate the effectiveness of ProSelfLC through extensive experiments in both clean and noisy settings, and on both image and protein datasets.
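The correction rule described in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's exact formulation: the linear time-based global trust, the entropy-based local trust, and the temperature value used here are simplifying assumptions; the paper defines its own trust schedules.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax with temperature; temperature < 1 sharpens the
    distribution, i.e., lowers its entropy (addresses issue (3))."""
    z = logits / temperature
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def proselflc_target(one_hot, logits, t, total_t, temperature=0.5):
    """Illustrative ProSelfLC-style label correction for one sample.

    one_hot: annotated label distribution
    logits:  model outputs for the sample (self knowledge)
    t, total_t: current and total training iterations
    """
    # Sharpen self knowledge with a low temperature before using it.
    p = softmax(logits, temperature)
    num_classes = len(p)
    # Global trust grows with training time (assumption: linear schedule),
    # reflecting that DNNs learn meaningful patterns before fitting noise.
    global_trust = t / total_t
    # Local trust is higher when the prediction has low entropy
    # (high confidence); normalised by the maximum entropy log(C).
    entropy = -np.sum(p * np.log(p + 1e-12))
    local_trust = 1.0 - entropy / np.log(num_classes)
    # Combined weight on the self prediction versus the annotation.
    eps = global_trust * local_trust
    return (1.0 - eps) * one_hot + eps * p
```

Early in training (`t` small) or for high-entropy predictions, `eps` stays near zero and the annotated label dominates; late in training, a confident prediction is progressively trusted over a possibly noisy annotation.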