Paper Title

On the Sparsity of Neural Machine Translation Models

Authors

Yong Wang, Longyue Wang, Victor O. K. Li, Zhaopeng Tu

Abstract

Modern neural machine translation (NMT) models employ a large number of parameters, which leads to serious over-parameterization and typically causes the underutilization of computational resources. In response to this problem, we empirically investigate whether the redundant parameters can be reused to achieve better performance. Experiments and analyses are systematically conducted on different datasets and NMT architectures. We show that: 1) the pruned parameters can be rejuvenated to improve the baseline model by up to +0.8 BLEU points; 2) the rejuvenated parameters are reallocated to enhance the ability to model low-level lexical information.
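
The abstract describes a prune-then-rejuvenate procedure but gives no implementation details. As a rough illustration only, here is a minimal PyTorch sketch of magnitude pruning followed by re-initializing ("rejuvenating") the pruned entries; the function names, the magnitude-based pruning criterion, and the Gaussian re-initialization are assumptions made for exposition, not the authors' actual method.

```python
import torch

def prune_by_magnitude(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Return a 0/1 mask zeroing out the `sparsity` fraction of
    lowest-magnitude entries (a generic criterion, assumed here)."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    # The k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

def rejuvenate(weight: torch.Tensor, mask: torch.Tensor,
               std: float = 0.01) -> torch.Tensor:
    """Keep surviving weights and re-initialize pruned positions with
    small random values so they can be trained again (one plausible
    scheme; the paper's re-initialization may differ)."""
    fresh = torch.randn_like(weight) * std  # hypothetical re-init choice
    return weight * mask + fresh * (1.0 - mask)

# Toy usage: prune 30% of a weight matrix, then revive the pruned slots.
w = torch.randn(512, 512)
mask = prune_by_magnitude(w, sparsity=0.3)
w = rejuvenate(w, mask)
```

In practice the revived parameters would then be trained further (the paper reports reallocating them toward low-level lexical modeling); the training schedule is beyond this sketch.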
