Paper Title

CLEAR: Contrastive Learning for Sentence Representation

Paper Authors

Wu, Zhuofeng; Wang, Sinong; Gu, Jiatao; Khabsa, Madian; Sun, Fei; Ma, Hao

Paper Abstract

Pre-trained language models have proven their unique powers in capturing implicit language features. However, most pre-training approaches focus on the word-level training objective, while sentence-level objectives are rarely studied. In this paper, we propose Contrastive LEArning for sentence Representation (CLEAR), which employs multiple sentence-level augmentation strategies in order to learn a noise-invariant sentence representation. These augmentations include word and span deletion, reordering, and substitution. Furthermore, we investigate the key reasons that make contrastive learning effective through numerous experiments. We observe that different sentence augmentations during pre-training lead to different performance improvements on various downstream tasks. Our approach is shown to outperform multiple existing methods on both SentEval and GLUE benchmarks.
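
To make the four augmentation strategies named in the abstract concrete, here is a minimal, hypothetical Python sketch. The function names, deletion probabilities, and the caller-supplied synonym lexicon are illustrative assumptions for exposition, not the authors' implementation.

```python
import random

def word_deletion(tokens, p=0.15):
    # Drop each token independently with probability p.
    kept = [t for t in tokens if random.random() > p]
    return kept or list(tokens)  # never return an empty sentence

def span_deletion(tokens, ratio=0.15):
    # Remove one contiguous span covering roughly `ratio` of the tokens.
    if len(tokens) <= 1:
        return list(tokens)
    span = max(1, int(len(tokens) * ratio))
    start = random.randrange(len(tokens) - span + 1)
    return tokens[:start] + tokens[start + span:]

def reordering(tokens, n_swaps=2):
    # Swap n_swaps randomly chosen pairs of positions.
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def substitution(tokens, synonyms, p=0.15):
    # Replace a token with a random synonym, with probability p; the
    # synonym lexicon is a caller-supplied assumption of this sketch.
    return [random.choice(synonyms[t]) if t in synonyms and random.random() < p
            else t for t in tokens]
```

The contrastive objective itself is not spelled out in the abstract; a common formulation for this setup (following SimCLR-style frameworks) is the NT-Xent loss, sketched below assuming a PyTorch pipeline where `z1[i]` and `z2[i]` are encoder embeddings of two augmented views of the same sentence. The paper's exact objective may differ in detail.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    # Stack both views, normalize, and compute pairwise cosine similarities.
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / temperature                       # (2N, 2N) logits
    sim.fill_diagonal_(float('-inf'))                   # mask self-similarity
    # The positive for row i is row i+N (and vice versa); all other rows in
    # the batch serve as negatives, so this reduces to cross-entropy.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)
```

Noise invariance here means the loss pulls the two augmented views of a sentence together while pushing them away from every other sentence in the batch, so the encoder learns to ignore the perturbations introduced by the augmentations above.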
