Paper Title

Generative Pre-training for Paraphrase Generation by Representing and Predicting Spans in Exemplars

Paper Authors

Tien-Cuong Bui, Van-Duc Le, Hai-Thien To, Sang Kyun Cha

Paper Abstract

Paraphrase generation is a long-standing problem and plays an essential role in many natural language processing tasks. Despite some encouraging results, recent methods either suffer from favoring generic utterances or need to retrain the model from scratch for each new dataset. This paper presents a novel approach to paraphrasing sentences, extended from the GPT-2 model. We develop a template-masking technique, named first-order masking, that masks out irrelevant words in exemplars using POS taggers; the paraphrasing task thus becomes predicting spans in the masked templates. Our proposed approach outperforms competitive baselines, especially in semantic preservation. To prevent the model from being biased towards a given template, we introduce a technique, referred to as second-order masking, which uses a Bernoulli distribution to control the visibility of the first-order-masked template's tokens. This technique also allows the model to produce diverse paraphrased sentences at test time by adjusting the second-order-masking level. For scale-up purposes, we compare two alternative template-selection methods and show that they are comparable in preserving semantic information.
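As a rough illustration of the two masking steps described in the abstract, below is a minimal Python sketch using NLTK's POS tagger. The abstract does not specify which POS categories count as "irrelevant," what mask token is used, or what Bernoulli parameter is chosen; the choices here (masking content-word tags NN/VB/JJ/RB, a `<mask>` token, p = 0.3) are illustrative assumptions only, not the paper's actual configuration.

```python
import random

import nltk  # assumed dependency: pip install nltk

# One-time downloads for the tokenizer and POS tagger (package names may
# vary slightly across NLTK versions).
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

MASK = "<mask>"  # hypothetical mask token; the paper's token may differ

# Assumption: content-word tags (nouns, verbs, adjectives, adverbs) are the
# "irrelevant" words to hide, so only the exemplar's syntactic frame survives.
CONTENT_TAG_PREFIXES = ("NN", "VB", "JJ", "RB")


def first_order_mask(exemplar: str) -> list[str]:
    """POS-tag the exemplar and mask content words, keeping its template."""
    tagged = nltk.pos_tag(nltk.word_tokenize(exemplar))
    return [MASK if tag.startswith(CONTENT_TAG_PREFIXES) else tok
            for tok, tag in tagged]


def second_order_mask(template: list[str], p: float = 0.3) -> list[str]:
    """Independently hide each still-visible token with probability p
    (a Bernoulli(p) draw per token)."""
    return [MASK if tok != MASK and random.random() < p else tok
            for tok in template]


if __name__ == "__main__":
    exemplar = "the quick brown fox jumps over the lazy dog"
    t1 = first_order_mask(exemplar)
    print(" ".join(t1))                     # first-order template
    print(" ".join(second_order_mask(t1)))  # sparser template, varies per run
```

Under these assumptions, raising p hides more of the template, which is consistent with the abstract's claim that adjusting the second-order-masking level trades template fidelity for paraphrase diversity at test time.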
