Paper Title
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Paper Authors
Paper Abstract
There is a fundamental limitation in the prediction performance that a machine learning model can achieve due to the inevitable uncertainty of the prediction target. In classification problems, this can be characterized by the Bayes error, which is the best achievable error with any classifier. The Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance and can be used to detect test set overfitting. We propose a simple and direct Bayes error estimator, where we just take the mean of the labels that show \emph{uncertainty} of the class assignments. Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data. In contrast to other methods, our approach is model-free and even instance-free. Moreover, it has no hyperparameters and empirically gives a more accurate estimate of the Bayes error than several baselines. Experiments using our method suggest that recently proposed deep networks such as the Vision Transformer may have reached, or be about to reach, the Bayes error for benchmark datasets. Finally, we discuss how we can study the inherent difficulty of the acceptance/rejection decision for scientific articles by estimating the Bayes error of the ICLR papers from 2017 to 2023.
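The abstract's "take the mean of the labels that show uncertainty" idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each instance comes with a soft label giving the class-posterior probability (e.g. the fraction of annotators voting for the positive class), so the binary Bayes error is the average of min(p, 1 - p) over these labels. The function name is hypothetical.

```python
import numpy as np

def estimate_bayes_error(soft_labels):
    """Direct Bayes error estimate for binary classification.

    soft_labels: array-like where soft_labels[i] is an estimate of
    P(y = 1 | x_i), e.g. the fraction of annotators who assigned
    instance i to class 1 (an assumption for this illustration).
    """
    p = np.asarray(soft_labels, dtype=float)
    # Per-instance uncertainty of the class assignment: min(p, 1 - p).
    # Averaging it over instances estimates the Bayes error.
    return float(np.minimum(p, 1.0 - p).mean())

# Mostly confident labels give a low estimate...
print(estimate_bayes_error([0.9, 0.5, 0.1, 0.8]))  # 0.225
# ...while maximally uncertain labels give the worst case, 0.5.
print(estimate_bayes_error([0.5, 0.5]))  # 0.5
```

Note there is no model to train and no hyperparameter to tune, which matches the "model-free and instance-free" claim: the estimate depends only on the soft labels, not on the feature vectors.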