Paper Title

Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters

Paper Authors

Eugene Yang, Suraj Nair, Dawn Lawrie, James Mayfield, Douglas W. Oard

Paper Abstract

A popular approach to creating a zero-shot cross-language retrieval model is to substitute a monolingual pretrained language model in the retrieval model with a multilingual pretrained language model such as Multilingual BERT. This multilingual model is fine-tuned to the retrieval task with monolingual data such as English MS MARCO, using the same training recipe as used for the monolingual retrieval model. However, such transferred models suffer from mismatches in the languages of the input text during training and inference. In this work, we propose transferring monolingual retrieval models using adapters, a parameter-efficient component for a transformer network. By adding adapters pretrained on language tasks for a specific language together with task-specific adapters, prior work has shown that adapter-enhanced models perform better than fine-tuning the entire model when transferring across languages in various NLP tasks. By constructing dense retrieval models with adapters, we show that models trained with monolingual data are more effective than fine-tuning the entire model when transferring to a Cross-Language Information Retrieval (CLIR) setting. However, we find that the prior suggestion of replacing the language adapters to match the target language at inference time is suboptimal for dense retrieval models. We provide an in-depth analysis of this discrepancy between other cross-language NLP tasks and CLIR.
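To make the adapter recipe in the abstract concrete, the sketch below assembles a MAD-X-style stack: a frozen multilingual encoder, a pretrained language adapter, and a task adapter that is the only component fine-tuned on English MS MARCO. It is a minimal sketch assuming the AdapterHub adapter-transformers fork (v3.x) and its hub identifiers such as "en/wiki@ukp"; the adapter name "retrieval" and the language-adapter swap at the end are illustrative, not the authors' exact configuration.

```python
# Minimal sketch of an adapter-based dense retrieval encoder, assuming the
# AdapterHub adapter-transformers fork; names and hub identifiers are examples.
from transformers import AutoAdapterModel
from transformers.adapters.composition import Stack

model = AutoAdapterModel.from_pretrained("bert-base-multilingual-cased")

# Load a pretrained language adapter for the training language (English).
en_adapter = model.load_adapter("en/wiki@ukp", config="pfeiffer")

# Add a new task adapter for dense retrieval and make only it trainable;
# the multilingual encoder and the language adapter stay frozen.
model.add_adapter("retrieval")
model.train_adapter("retrieval")

# Activate the stack: language adapter first, retrieval adapter on top.
model.active_adapters = Stack(en_adapter, "retrieval")

# ... fine-tune on English MS MARCO with the usual dense-retrieval recipe ...

# The transfer recipe from prior cross-lingual NLP work swaps the language
# adapter to match the target language at inference time:
zh_adapter = model.load_adapter("zh/wiki@ukp", config="pfeiffer")
model.active_adapters = Stack(zh_adapter, "retrieval")
# The paper reports that this swap is suboptimal for dense retrieval, so
# keeping the English adapter active is the alternative to compare against.
```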
