Paper title
Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet
Authors
Abstract
In this paper we present Scylla, a methodology for domain adaptation of Neural Machine Translation (NMT) systems that makes use of a multilingual FrameNet enriched with qualia relations as an external knowledge base. Domain adaptation techniques used in NMT usually require fine-tuning and in-domain training data, which may pose difficulties for those working with lesser-resourced languages and may also degrade the NMT system's performance on out-of-domain sentences. Scylla does not require fine-tuning of the NMT model, avoiding the risk of model overfitting and the consequent decrease in performance for out-of-domain translations. Two versions of Scylla are presented: one using the source sentence as input, and the other using the target sentence. We evaluate Scylla against a state-of-the-art commercial NMT system in an experiment in which 50 sentences from the Sports domain are translated from Brazilian Portuguese to English. Both versions of Scylla significantly outperform the baseline commercial system in HTER.