Paper Title
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Paper Authors
Paper Abstract
Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this paper, we propose novel methods for using language-specific subnetworks, which control cross-lingual parameter sharing, to reduce conflicts and increase positive transfer during fine-tuning. We introduce dynamic subnetworks, which are jointly updated with the model, and we combine our methods with meta-learning, an established, but complementary, technique for improving cross-lingual transfer. Finally, we provide extensive analyses of how each of our methods affects the models.
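To illustrate the core idea of language-specific subnetworks, here is a minimal sketch (not the paper's implementation): binary per-language masks over a shared model's parameters gate which weights each language's gradient updates may touch during fine-tuning, which limits cross-lingual update conflicts. The toy model, the random masks, and the `masked_step` helper are all hypothetical stand-ins; in the paper the subnetworks are derived or learned rather than sampled at random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy shared model standing in for a multilingual encoder.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# Hypothetical per-language binary masks over the parameters:
# 1 = this language may update the weight, 0 = frozen for it.
# Random here purely for illustration.
languages = ["en", "fi"]
masks = {
    lang: [torch.randint(0, 2, p.shape).float() for p in model.parameters()]
    for lang in languages
}

opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def masked_step(lang, x, y):
    """One fine-tuning step in which gradients are gated by the
    language-specific subnetwork mask before the optimizer update."""
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    for p, m in zip(model.parameters(), masks[lang]):
        if p.grad is not None:
            p.grad.mul_(m)  # zero out updates outside lang's subnetwork
    opt.step()
    return loss.item()

# Toy batches for two languages sharing one set of parameters.
for lang in languages:
    x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
    print(lang, masked_step(lang, x, y))
```

A dynamic subnetwork, in the paper's sense, would additionally update the masks jointly with the model during training rather than keeping them fixed as in this sketch.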