Paper Title
Flow-Adapter Architecture for Unsupervised Machine Translation
Paper Authors
Paper Abstract
In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages normalizing flows to explicitly model the distributions of sentence-level latent representations, which are subsequently used in conjunction with the attention mechanism for the translation task. The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another. This architecture allows for unsupervised training of each language independently. While there is prior work on latent variables for supervised MT, to the best of our knowledge, this is the first work that uses latent variables and normalizing flows for unsupervised MT. We obtain competitive results on several unsupervised MT benchmarks.
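To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each language has its own invertible flow that maps a sentence-level latent vector to a standard-normal base space, and that the "simple transformation" between languages can be illustrated by composing the source flow with the inverse of the target flow. The elementwise affine flow, the class name `AffineFlow`, the function `transfer_latent`, and all parameter values are hypothetical choices made for illustration; the abstract does not specify the flow family or the exact transformation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AffineFlow:
    """Hypothetical elementwise affine flow: eps = (z - shift) * exp(-log_scale)."""
    log_scale: List[float]
    shift: List[float]

    def forward(self, z: List[float]) -> Tuple[List[float], float]:
        # Map a language-specific latent z to the base space and return
        # the log-determinant of the Jacobian of that mapping.
        eps = [(zi - b) * math.exp(-s)
               for zi, s, b in zip(z, self.log_scale, self.shift)]
        log_det = -sum(self.log_scale)
        return eps, log_det

    def inverse(self, eps: List[float]) -> List[float]:
        # Map a base-space vector back to the language-specific latent space.
        return [e * math.exp(s) + b
                for e, s, b in zip(eps, self.log_scale, self.shift)]

    def log_prob(self, z: List[float]) -> float:
        # Change of variables: log p(z) = log N(eps; 0, I) + log|det d eps / d z|.
        eps, log_det = self.forward(z)
        base = sum(-0.5 * e * e - 0.5 * math.log(2.0 * math.pi) for e in eps)
        return base + log_det


def transfer_latent(z_src: List[float],
                    src_flow: AffineFlow,
                    tgt_flow: AffineFlow) -> List[float]:
    # Assumed cross-lingual step: source latent -> base space -> target latent.
    eps, _ = src_flow.forward(z_src)
    return tgt_flow.inverse(eps)


# Illustrative parameters only; in practice each flow would be trained
# on monolingual data for its own language.
flow_en = AffineFlow(log_scale=[0.0, math.log(2.0)], shift=[1.0, -1.0])
flow_de = AffineFlow(log_scale=[math.log(3.0), 0.0], shift=[0.0, 2.0])

z_en = [1.5, 1.0]
z_de = transfer_latent(z_en, flow_en, flow_de)
z_en_back = transfer_latent(z_de, flow_de, flow_en)
```

Because each flow depends only on its own language's parameters, `log_prob` can be maximized on monolingual data for that language alone, which is one way to read the abstract's claim about independent unsupervised training. In a full model the transferred latent would then be passed to an attention-based decoder, which this sketch omits.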