Paper Title

Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability

Paper Authors

Yoshinari Fujinuma, Jordan Boyd-Graber, Katharina Kann

Paper Abstract

Pretrained multilingual models enable zero-shot learning even for unseen languages, and that performance can be further improved via adaptation prior to finetuning. However, it is unclear how the number of pretraining languages influences a model's zero-shot learning for languages unseen during pretraining. To fill this gap, we ask the following research questions: (1) How does the number of pretraining languages influence zero-shot performance on unseen target languages? (2) Does the answer to that question change with model adaptation? (3) Do the findings for our first question change if the languages used for pretraining are all related? Our experiments on pretraining with related languages indicate that choosing a diverse set of languages is crucial. Without model adaptation, surprisingly, increasing the number of pretraining languages yields better results up to adding related languages, after which performance plateaus. In contrast, with model adaptation via continued pretraining, pretraining on a larger number of languages often gives further improvement, suggesting that model adaptation is crucial to exploit additional pretraining languages.
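
The "model adaptation via continued pretraining" discussed in the abstract amounts to resuming masked-language-model training on unlabeled text in the unseen target language before task finetuning. The following is a minimal sketch of that step using Hugging Face Transformers; the checkpoint name (xlm-roberta-base), corpus path, and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
# pip install transformers datasets
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# A multilingual MLM checkpoint; "xlm-roberta-base" is an illustrative
# stand-in, not the set of checkpoints trained in the paper.
model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Unlabeled raw text in the unseen target language (hypothetical file path).
raw = load_dataset("text", data_files={"train": "target_language_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_set = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# Continue masked-language-model pretraining on the target-language text.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
training_args = TrainingArguments(
    output_dir="adapted_model",
    per_device_train_batch_size=16,
    num_train_epochs=1,        # illustrative budget only
    learning_rate=2e-5,
)
Trainer(
    model=model,
    args=training_args,
    train_dataset=train_set,
    data_collator=collator,
).train()

# The adapted checkpoint is then finetuned on labeled source-language task
# data and evaluated zero-shot on the unseen target language.
```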
