Paper title
Rethinking Machine Learning Model Evaluation in Pathology
Paper authors
Paper abstract
Machine learning (ML) has been applied to pathology images in research and clinical practice with promising outcomes. However, standard ML models often lack the rigorous evaluation required for clinical decisions. Machine learning techniques developed for natural images are ill-equipped to deal with pathology images, which are very large and noisy, require expensive labeling, are hard to interpret, and are susceptible to spurious correlations. We propose a set of practical guidelines for ML evaluation in pathology that address these concerns. The paper covers measures for setting up the evaluation framework, ways to deal effectively with variability in labels, and a recommended suite of tests for issues related to domain shift, robustness, and confounding variables. We hope that the proposed framework will bridge the gap between ML researchers and domain experts, leading to wider adoption of ML techniques in pathology and improved patient outcomes.