Paper Title

Robust Target Training for Multi-Source Domain Adaptation

Paper Authors

Zhongying Deng, Da Li, Yi-Zhe Song, Tao Xiang

Paper Abstract

Given multiple labeled source domains and a single target domain, most existing multi-source domain adaptation (MSDA) models are trained on data from all domains jointly in one step. Such a one-step approach limits their ability to adapt to the target domain, because the training set is dominated by the more numerous labeled source domain data. This source domain bias can be alleviated by introducing a second training step, in which the model is fine-tuned on the unlabeled target domain data only, using pseudo labels as supervision. However, pseudo labels are inevitably noisy and, when used unchecked, can negatively impact model performance. To address this problem, we propose a novel Bi-level Optimization based Robust Target Training (BORT$^2$) method for MSDA. Given any existing fully-trained one-step MSDA model, BORT$^2$ turns it into a labeling function that generates pseudo labels for the target data, and trains a target model using the pseudo-labeled target data only. Crucially, the target model is a stochastic CNN designed to be intrinsically robust to the label noise produced by the labeling function. Such a stochastic CNN models each target instance feature as a Gaussian distribution, with an entropy-maximization regularizer deployed to measure label uncertainty; this uncertainty is further exploited to alleviate the negative impact of noisy pseudo labels. Training the labeling function and the target model poses a nested bi-level optimization problem, for which we formulate an elegant solution based on implicit differentiation. Extensive experiments demonstrate that our proposed method achieves state-of-the-art performance on three MSDA benchmarks, including the large-scale DomainNet dataset. Our code will be available at \url{https://github.com/Zhongying-Deng/BORT2}.
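To make the stochastic-CNN idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' released code) of a feature head that models each target instance feature as a diagonal Gaussian, samples from it with the reparameterization trick, and computes the Gaussian's differential entropy so it can be maximized as a regularizer against noisy pseudo labels. All names (`StochasticHead`, `robust_target_loss`, the entropy weight) and the exact loss form are assumptions for illustration.

```python
# Hypothetical sketch of the stochastic feature head described in the abstract.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticHead(nn.Module):
    """Maps a backbone feature to a Gaussian N(mu, diag(sigma^2)) and
    samples a stochastic feature from it via the reparameterization trick."""
    def __init__(self, in_dim: int, feat_dim: int, num_classes: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, feat_dim)       # per-instance feature mean
        self.log_var = nn.Linear(in_dim, feat_dim)  # per-instance log-variance
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, h: torch.Tensor):
        mu, log_var = self.mu(h), self.log_var(h)
        eps = torch.randn_like(mu)                  # reparameterization trick
        z = mu + eps * torch.exp(0.5 * log_var)     # sampled stochastic feature
        logits = self.classifier(z)
        # Differential entropy of a diagonal Gaussian:
        #   H = 0.5 * sum_i (log(2*pi*e) + log sigma_i^2)
        entropy = 0.5 * (math.log(2 * math.pi * math.e) + log_var).sum(dim=1)
        return logits, entropy

def robust_target_loss(logits, pseudo_labels, entropy, entropy_weight=0.1):
    """Cross-entropy on pseudo-labeled target data, minus a weighted Gaussian
    entropy term: minimizing this loss maximizes the entropy, discouraging
    overconfident features that overfit noisy pseudo labels. The weight 0.1
    is an arbitrary placeholder, not a value from the paper."""
    ce = F.cross_entropy(logits, pseudo_labels)
    return ce - entropy_weight * entropy.mean()
```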

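The abstract also describes a nested bi-level problem: the one-step MSDA model acts as a labeling function producing pseudo labels, while the target model is trained on those labels, and the paper solves the coupled problem via implicit differentiation. The sketch below shows only a simplified, alternating reading of that two-step loop, reusing the head and loss from the previous sketch; `labeling_fn`, `target_model`, and `target_loader` are hypothetical placeholders, and the implicit-differentiation update of the labeling function is deliberately omitted.

```python
# Simplified alternating sketch of the two-level training loop; NOT the
# paper's implicit-differentiation solution to the bi-level problem.
import torch

@torch.no_grad()
def make_pseudo_labels(labeling_fn, images):
    """Labeling function: a fully-trained one-step MSDA model produces hard
    pseudo labels for unlabeled target images."""
    labeling_fn.eval()
    return labeling_fn(images).argmax(dim=1)

def train_target_model(labeling_fn, target_model, head, target_loader,
                       optimizer, epochs=1):
    """Target training step: fit the stochastic target model on
    pseudo-labeled target data only, with the entropy-regularized loss."""
    target_model.train()
    for _ in range(epochs):
        for images in target_loader:  # unlabeled target-domain batches
            pseudo = make_pseudo_labels(labeling_fn, images)
            logits, entropy = head(target_model(images))
            loss = robust_target_loss(logits, pseudo, entropy)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```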