Paper Title
Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift
Paper Authors
Paper Abstract
Adversarial learning has demonstrated good performance in the unsupervised domain adaptation setting, by learning domain-invariant representations. However, recent work has shown limitations of this approach when label distributions differ between the source and target domains. In this paper, we propose a new assumption, generalized label shift ($GLS$), to improve robustness against mismatched label distributions. $GLS$ states that, conditioned on the label, there exists a representation of the input that is invariant between the source and target domains. Under $GLS$, we provide theoretical guarantees on the transfer performance of any classifier. We also devise necessary and sufficient conditions for $GLS$ to hold, by estimating the relative class weights between domains and appropriately reweighting samples. Our weight estimation method can be straightforwardly and generically applied to existing domain adaptation (DA) algorithms that learn domain-invariant representations, with small computational overhead. In particular, we modify three DA algorithms, JAN, DANN and CDAN, and evaluate their performance on standard and artificial DA tasks. Our algorithms outperform the base versions, with vast improvements for large label distribution mismatches. Our code is available at https://tinyurl.com/y585xt6j.
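The relative class weights mentioned in the abstract can be estimated from a classifier's predictions alone, in the style of black-box shift estimation: solve a linear system built from the classifier's confusion statistics on the source domain and its predicted-label distribution on the target domain. The sketch below is illustrative only (function and variable names are our own, not from the paper's released code), and assumes hard label predictions for both domains.

```python
import numpy as np

def estimate_class_weights(source_preds, source_labels, target_preds, num_classes):
    """Estimate w[y] = P_target(y) / P_source(y) from predictions only.

    Solves C @ w = mu, where C[i, j] = P_source(y_hat = i, y = j) is the
    classifier's joint prediction/label distribution on source data, and
    mu[i] = P_target(y_hat = i) is its predicted-label distribution on
    target data. This is a confusion-matrix-based (BBSE-style) estimator,
    an illustrative stand-in for the paper's weight estimation step.
    """
    C = np.zeros((num_classes, num_classes))
    for p, y in zip(source_preds, source_labels):
        C[p, y] += 1.0
    C /= len(source_preds)
    mu = np.bincount(target_preds, minlength=num_classes) / len(target_preds)
    w = np.linalg.solve(C, mu)
    # Estimation noise can produce small negative entries; clip them.
    return np.maximum(w, 0.0)

# Toy check: a perfect classifier, balanced source labels, and a target
# domain where class 0 is three times as frequent as class 1.
source_labels = np.array([0] * 50 + [1] * 50)
source_preds = source_labels.copy()          # perfect predictions on source
target_preds = np.array([0] * 75 + [1] * 25)
w = estimate_class_weights(source_preds, source_labels, target_preds, 2)
# Recovers the true class ratios: w ≈ [1.5, 0.5]
```

Reweighting source samples by `w[y]` then makes the (weighted) source label distribution match the target's, which is the precondition for learning representations satisfying $GLS$.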