Self-Supervision on Images and Text Reduces Reliance on Visual Shortcut Features

Paper title

Self-Supervision on Images and Text Reduces Reliance on Visual Shortcut Features

Authors

Anil Palepu, Andrew L. Beam

Abstract

Deep learning models trained in a fully supervised manner have been shown to rely on so-called "shortcut" features. Shortcut features are inputs that are associated with the outcome of interest in the training data, but are either no longer associated or not present in testing or deployment settings. Here we provide experiments that show recent self-supervised models trained on images and text provide more robust image representations and reduce the model's reliance on visual shortcut features on a realistic medical imaging example. Additionally, we find that these self-supervised models "forget" shortcut features more quickly than fully supervised ones when fine-tuned on labeled data. Though not a complete solution, our experiments provide compelling evidence that self-supervised models trained on images and text provide some resilience to visual shortcut features.
