Paper Title
Resource-efficient domain adaptive pre-training for medical images
Paper Authors
Paper Abstract
The deep learning-based analysis of medical images suffers from data scarcity because of high annotation costs and privacy concerns. Researchers in this domain have used transfer learning to avoid overfitting when using complex architectures. However, domain differences between the pre-training and downstream data hamper performance on the downstream task. Some recent studies have successfully used domain-adaptive pre-training (DAPT) to address this issue. In DAPT, models are initialized with weights pre-trained on a generic dataset, and further pre-training is performed on a moderately sized in-domain dataset (medical images). Although this technique achieves good downstream results in terms of accuracy and robustness, it is computationally expensive even when the DAPT datasets are moderately sized. These compute-intensive techniques and models impact the environment negatively and create an uneven playing field for researchers with limited resources. This study proposes three computationally efficient DAPT techniques that do not compromise downstream accuracy or robustness. The first (partial DAPT) performs DAPT on a subset of layers. The second (hybrid DAPT) adopts a hybrid strategy, performing partial DAPT for a few epochs and then full DAPT for the remaining epochs. The third performs DAPT on simplified variants of the base architecture. The results showed that, compared with standard (full) DAPT, the hybrid DAPT technique achieved better performance on the development and external datasets. In contrast, the simplified architectures (after DAPT) achieved the best robustness while achieving modest performance on the development dataset.
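The abstract does not give implementation details, but the core mechanism of partial and hybrid DAPT can be expressed as freezing and unfreezing parameter groups during further pre-training. A minimal sketch follows, assuming a PyTorch setup; the `nn.Sequential` model, the layer split index, and the epoch counts are all hypothetical stand-ins, not the paper's actual architecture or schedule:

```python
import torch
from torch import nn

# Hypothetical stand-in for a backbone initialized with generic
# (e.g. ImageNet) pre-trained weights.
model = nn.Sequential(
    nn.Linear(16, 32),   # early layers: frozen during partial DAPT
    nn.ReLU(),
    nn.Linear(32, 32),
    nn.ReLU(),
    nn.Linear(32, 16),   # later layers: updated during partial DAPT
)

def set_partial_dapt(model: nn.Sequential, trainable_from: int = 4) -> None:
    """Partial DAPT: only layers at index >= `trainable_from` get gradients."""
    for i, module in enumerate(model):
        for p in module.parameters():
            p.requires_grad = i >= trainable_from

def set_full_dapt(model: nn.Sequential) -> None:
    """Full DAPT: every parameter is trainable."""
    for p in model.parameters():
        p.requires_grad = True

# Hybrid DAPT: partial DAPT for the first few epochs, then full DAPT
# for the remaining epochs (epoch counts here are illustrative).
partial_epochs, total_epochs = 2, 5
for epoch in range(total_epochs):
    if epoch < partial_epochs:
        set_partial_dapt(model)
    else:
        set_full_dapt(model)
    # ... one epoch of in-domain (medical image) pre-training here ...
```

Since the optimizer skips parameters with `requires_grad = False`, the partial phase only pays the backward-pass cost for the trainable subset, which is where the computational saving comes from.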