Paper Title
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Paper Authors
Paper Abstract
Large multilingual language models typically share their parameters across all languages, which enables cross-lingual task transfer, but learning can also be hindered when training updates from different languages are in conflict. In this paper, we propose novel methods for using language-specific subnetworks, which control cross-lingual parameter sharing, to reduce conflicts and increase positive transfer during fine-tuning. We introduce dynamic subnetworks, which are jointly updated with the model, and we combine our methods with meta-learning, an established, but complementary, technique for improving cross-lingual transfer. Finally, we provide extensive analyses of how each of our methods affects the models.
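To illustrate the core idea of language-specific subnetworks, here is a minimal sketch (not the paper's implementation): binary per-language masks over a shared model's parameters gate which weights each language's gradient updates may touch during fine-tuning, which limits cross-lingual update conflicts. The toy model, the random masks, and the `masked_step` helper are all hypothetical stand-ins; in the paper the subnetworks are derived or learned rather than sampled at random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy shared model standing in for a multilingual encoder.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

# Hypothetical per-language binary masks over the parameters:
# 1 = this language may update the weight, 0 = frozen for it.
# Random here purely for illustration.
languages = ["en", "fi"]
masks = {
    lang: [torch.randint(0, 2, p.shape).float() for p in model.parameters()]
    for lang in languages
}

opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def masked_step(lang, x, y):
    """One fine-tuning step in which gradients are gated by the
    language-specific subnetwork mask before the optimizer update."""
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    for p, m in zip(model.parameters(), masks[lang]):
        if p.grad is not None:
            p.grad.mul_(m)  # zero out updates outside lang's subnetwork
    opt.step()
    return loss.item()

# Toy batches for two languages sharing one set of parameters.
for lang in languages:
    x, y = torch.randn(4, 8), torch.randint(0, 2, (4,))
    print(lang, masked_step(lang, x, y))
```

A dynamic subnetwork, in the paper's sense, would additionally update the masks jointly with the model during training rather than keeping them fixed as in this sketch.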