Paper Title
SLM: Learning a Discourse Language Representation with Sentence Unshuffling
Paper Authors
Paper Abstract
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation in a fully self-supervised manner. Recent pre-training methods in NLP focus on learning either bottom or top-level language representations: contextualized word representations derived from language model objectives at one extreme and a whole sequence representation learned by order classification of two given textual segments at the other. However, these models are not directly encouraged to capture representations of intermediate-size structures that exist in natural languages such as sentences and the relationships among them. To that end, we propose a new approach to encourage learning of a contextualized sentence-level representation by shuffling the sequence of input sentences and training a hierarchical transformer model to reconstruct the original ordering. Through experiments on downstream tasks such as GLUE, SQuAD, and DiscoEval, we show that this feature of our model improves the performance of the original BERT by large margins.
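To make the pre-training objective concrete, the following is a minimal sketch (not the authors' released code) of how a sentence-unshuffling training example could be constructed: a document's sentences are shuffled, and the supervision signal is the permutation that restores the original order, which a hierarchical model can be trained to predict. The function name make_unshuffling_example and the use of plain Python strings are illustrative assumptions.

    # Minimal sketch of sentence-unshuffling data construction (assumed interface).
    import random
    from typing import List, Tuple

    def make_unshuffling_example(
        sentences: List[str], rng: random.Random
    ) -> Tuple[List[str], List[int]]:
        """Shuffle a document's sentences and return (shuffled, target_order).

        target_order[i] is the position in the shuffled list of the sentence
        that originally came i-th, i.e. the sequence a decoder should emit
        to reconstruct the original ordering.
        """
        indices = list(range(len(sentences)))
        rng.shuffle(indices)                       # permutation applied to the input
        shuffled = [sentences[i] for i in indices]
        # Invert the permutation: where did original sentence i end up?
        target_order = [0] * len(sentences)
        for new_pos, orig_idx in enumerate(indices):
            target_order[orig_idx] = new_pos
        return shuffled, target_order

    if __name__ == "__main__":
        rng = random.Random(0)
        doc = ["Alice went home.", "She made tea.", "Then she read a book."]
        shuffled, target = make_unshuffling_example(doc, rng)
        print(shuffled)   # scrambled sentences (model input)
        print(target)     # shuffled positions, listed in original order (training target)

In this framing, predicting target_order amounts to pointing, for each original position, at the shuffled sentence that belongs there, which matches the reconstruct-the-original-ordering objective described in the abstract.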