Paper Title
Robust Domain Adaptation for Pre-trained Multilingual Neural Machine Translation Models
Paper Authors
Paper Abstract
Recent literature has demonstrated the potential of multilingual Neural Machine Translation (mNMT) models. However, the most efficient models are not well suited to specialized industries. In these cases, internal data is scarce and expensive to obtain across all language pairs. Therefore, fine-tuning an mNMT model on a specialized domain is hard. In this context, we focus on a new task: domain adaptation of a pre-trained mNMT model on a single language pair while maintaining model quality on generic-domain data for all language pairs. The risk of degradation on the generic domain and on the other pairs is high. This task is key for mNMT model adoption in industry and borders many others. We propose a fine-tuning procedure for the generic mNMT model that combines embedding freezing and an adversarial loss. Our experiments demonstrate that, compared to a naive standard approach, the procedure improves performance on specialized data with minimal loss of initial performance on the generic domain for all language pairs (+10.0 BLEU on specialized data; -0.01 to -0.5 BLEU on the WMT and Tatoeba datasets for the other pairs with M2M100).
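The abstract does not reproduce the training code, but one of its two ingredients, embedding freezing, is straightforward to illustrate. The sketch below is an assumption about how such a setup might look with PyTorch and Hugging Face transformers, not the authors' implementation: the facebook/m2m100_418M checkpoint, the learning rate, and the lambda_adv weighting of the adversarial term are all hypothetical choices for illustration.

```python
# Minimal sketch (an assumption, not the paper's released code): freezing the
# embedding layer of a pre-trained M2M100 model before in-domain fine-tuning.
import torch
from transformers import M2M100ForConditionalGeneration

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

# Freeze the shared (tied) embeddings so the multilingual vocabulary
# representation is preserved while the rest of the model adapts.
for param in model.get_input_embeddings().parameters():
    param.requires_grad = False

# Only the remaining trainable parameters are handed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)

# During fine-tuning, the total objective would combine the standard
# translation cross-entropy with the paper's adversarial regularizer;
# lambda_adv is a hypothetical weighting hyperparameter, since the exact
# formulation is not given in the abstract:
#     loss = ce_loss + lambda_adv * adversarial_loss
```

Freezing the embeddings keeps the shared multilingual token space anchored, which is one plausible reason the generic-domain BLEU drop on the other language pairs stays small.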