Paper Title

On the Sparsity of Neural Machine Translation Models

Authors

Yong Wang, Longyue Wang, Victor O. K. Li, Zhaopeng Tu

Abstract

Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources. In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Experiments and analyses are systematically conducted on different datasets and NMT architectures. We show that: 1) the pruned parameters can be rejuvenated to improve the baseline model by up to +0.8 BLEU points; 2) the rejuvenated parameters are reallocated to enhance the ability to model low-level lexical information.
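
The abstract describes a prune-then-rejuvenate procedure but gives no implementation details. As a rough illustration only, here is a minimal PyTorch sketch of magnitude pruning followed by re-initializing ("rejuvenating") the pruned entries; the function names, the magnitude-based pruning criterion, and the Gaussian re-initialization are assumptions made for exposition, not the authors' actual method.

```python
import torch

def prune_by_magnitude(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask zeroing out the `sparsity` fraction of
    lowest-magnitude entries (a generic criterion, assumed here)."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    # The k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

def rejuvenate(weight: torch.Tensor, mask: torch.Tensor,
               std: float = 0.01) -> torch.Tensor:
    """Keep surviving weights and re-initialize pruned positions with
    small random values so they can be trained again (one plausible
    scheme; the paper's re-initialization may differ)."""
    fresh = torch.randn_like(weight) * std  # hypothetical re-init choice
    return weight * mask + fresh * (1.0 - mask)

# Toy usage: prune 30% of a weight matrix, then revive the pruned slots.
w = torch.randn(512, 512)
mask = prune_by_magnitude(w, sparsity=0.3)
w = rejuvenate(w, mask)
```

In practice the revived parameters would then be trained further (the paper reports reallocating them toward low-level lexical modeling); the training schedule is beyond this sketch.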
