Paper Title

GRET: Global Representation Enhanced Transformer

Authors

Rongxiang Weng, Haoran Wei, Shujian Huang, Heng Yu, Lidong Bing, Weihua Luo, Jiajun Chen

Abstract

Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks. The encoder maps the words in the input sentence into a sequence of hidden states, which are then fed into the decoder to generate the output sentence. These hidden states usually correspond to the input words and focus on capturing local information. However, the global (sentence level) information is seldom explored, leaving room for the improvement of generation quality. In this paper, we propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network. Specifically, in the proposed model, an external state is generated for the global representation from the encoder. The global representation is then fused into the decoder during the decoding process to improve generation quality. We conduct experiments in two text generation tasks: machine translation and text summarization. Experimental results on four WMT machine translation tasks and the LCSTS text summarization task demonstrate the effectiveness of the proposed approach on natural language generation.
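The abstract describes the architecture only at a high level: an external global state is produced from the encoder's hidden states and then fused into the decoder. The sketch below is a minimal illustration of that idea, not the paper's actual method; the mean pooling and gated fusion used here are assumptions chosen for clarity (GRET's own generation and fusion mechanisms are detailed in the full paper).

```python
import torch
import torch.nn as nn

class GlobalRepresentationFusion(nn.Module):
    """Illustrative sketch of the GRET idea: derive a sentence-level
    (global) state from encoder hidden states, then fuse it into each
    decoder state. Pooling and gating choices here are assumptions."""

    def __init__(self, d_model: int):
        super().__init__()
        self.global_proj = nn.Linear(d_model, d_model)   # produces the external global state
        self.gate = nn.Linear(2 * d_model, d_model)      # controls how much global info enters each position

    def global_state(self, enc_states: torch.Tensor, src_mask: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, src_len, d_model); src_mask: (batch, src_len), 1 for real tokens.
        # Masked mean pooling over source positions (an assumed, simple aggregator).
        mask = src_mask.unsqueeze(-1).float()
        pooled = (enc_states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return torch.tanh(self.global_proj(pooled))      # (batch, d_model)

    def fuse(self, dec_states: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # dec_states: (batch, tgt_len, d_model); broadcast g over all target positions
        # and mix it in through a sigmoid gate (an assumed fusion mechanism).
        g_exp = g.unsqueeze(1).expand_as(dec_states)
        gate = torch.sigmoid(self.gate(torch.cat([dec_states, g_exp], dim=-1)))
        return dec_states + gate * g_exp

# Usage sketch with dummy tensors:
# fusion = GlobalRepresentationFusion(d_model=512)
# enc = torch.randn(2, 7, 512); msk = torch.ones(2, 7)
# g = fusion.global_state(enc, msk)                # (2, 512) global representation
# out = fusion.fuse(torch.randn(2, 5, 512), g)     # decoder states enhanced with global info
```

The gated residual fusion is one common way to inject a fixed-size context vector into per-position states without overwhelming the local information each decoder state carries; the paper itself may use a different aggregation and fusion design.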
