Title
Improving Zero-Shot Multilingual Translation with Universal Representations and Cross-Mappings
Authors
Abstract
Many-to-many multilingual neural machine translation can translate between language pairs unseen during training, i.e., perform zero-shot translation. Improving zero-shot translation requires the model to learn universal representations and cross-mapping relationships so that knowledge learned on the supervised directions transfers to the zero-shot directions. In this work, we propose the state mover's distance, based on optimal transport theory, to model the difference between the representations output by the encoder. We then bridge the gap between the semantically equivalent representations of different languages at the token level by minimizing the proposed distance, so that the model learns universal representations. In addition, we propose an agreement-based training scheme that helps the model make consistent predictions from semantically equivalent sentences, thereby learning universal cross-mapping relationships for all translation directions. Experimental results on diverse multilingual datasets show that our method consistently improves over the baseline system and other contrastive methods. Analysis shows that our method better aligns the semantic spaces and improves prediction consistency.
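The two ingredients of the abstract can be sketched in code: an entropy-regularized optimal-transport distance between two sets of token-level encoder states, and a symmetric-KL agreement term between predictions made from semantically equivalent inputs. This is a minimal illustrative sketch, not the paper's implementation; the cosine cost, Sinkhorn solver, function names, and all hyperparameters are assumptions.

```python
import numpy as np

def state_movers_distance(X, Y, reg=0.1, n_iters=200):
    """Entropy-regularized optimal-transport (Sinkhorn) distance between
    two sets of encoder states X (m x d) and Y (n x d), using a cosine
    cost. A simplified stand-in for the paper's state mover's distance;
    the cost function, regularizer, and iteration count are illustrative."""
    # Cosine cost matrix between L2-normalized states, bounded in [0, 2].
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    Yn = Y / (np.linalg.norm(Y, axis=1, keepdims=True) + 1e-12)
    C = 1.0 - Xn @ Yn.T
    K = np.exp(-C / reg)                       # Gibbs kernel
    a = np.full(X.shape[0], 1.0 / X.shape[0])  # uniform mass over tokens
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])
    u = np.ones_like(a)
    for _ in range(n_iters):                   # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]            # transport plan
    return float((P * C).sum())

def agreement_loss(p, q):
    """Symmetric KL divergence between two next-token distributions
    predicted from semantically equivalent sentences -- a rough sketch
    of an agreement-based consistency objective."""
    eps = 1e-12
    kl = lambda a, b: np.sum(a * (np.log(a + eps) - np.log(b + eps)), axis=-1)
    return float(np.mean(0.5 * (kl(p, q) + kl(q, p))))
```

Minimizing the first term pulls semantically equivalent encoder states of different languages together, while the second term pushes the decoder toward the same output distribution regardless of which equivalent source it reads.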