Paper Title

Cross-Thought for Sentence Encoder Pre-training

Paper Authors

Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jing Jiang, Jingjing Liu

Paper Abstract

In this paper, we propose Cross-Thought, a novel approach to pre-training sequence encoder, which is instrumental in building reusable sequence embeddings for large-scale NLP tasks such as question answering. Instead of using the original signals of full sentences, we train a Transformer-based sequence encoder over a large set of short sequences, which allows the model to automatically select the most useful information for predicting masked words. Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders trained with continuous sentence signals as well as traditional masked language modeling baselines. Our proposed approach also achieves new state of the art on HotpotQA (full-wiki setting) by improving intermediate information retrieval performance.
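
The abstract describes the pre-training signal only at a high level: many short sequences are encoded by a shared Transformer, and information is shared across their sequence embeddings to help recover masked words. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, not the authors' released implementation; all module names, dimensions, the first-token pooling, and the single cross-sequence attention layer are assumptions made for illustration.

```python
# Illustrative sketch only: encode short sequences independently, then let
# attention across their sequence embeddings supply context for masked-word
# prediction. Hyperparameters and pooling choices are placeholders.
import torch
import torch.nn as nn

class CrossSequenceMLMSketch(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2, max_len=64):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)   # shared per-sequence encoder
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)            # predicts masked tokens

    def forward(self, seqs):
        # seqs: (batch, n_seq, seq_len) token ids; each row is one short sequence
        b, n, L = seqs.shape
        pos = torch.arange(L, device=seqs.device)
        x = self.tok_emb(seqs) + self.pos_emb(pos)                # (b, n, L, d)
        h = self.encoder(x.view(b * n, L, -1))                    # encode each sequence independently
        emb = h[:, 0].view(b, n, -1)                              # first-token state as the sequence embedding
        # Cross-sequence attention: each sequence embedding attends over the
        # other sequences' embeddings to pull in the most useful context.
        ctx, _ = self.cross_attn(emb, emb, emb)                   # (b, n, d)
        h = h.view(b, n, L, -1) + ctx.unsqueeze(2)                # inject context into every token state
        return self.lm_head(h)                                    # (b, n, L, vocab) logits

# Toy usage: 2 groups of 3 short sequences, 16 tokens each.
model = CrossSequenceMLMSketch()
logits = model(torch.randint(0, 30522, (2, 3, 16)))
print(logits.shape)  # torch.Size([2, 3, 16, 30522])
```

After pre-training with a masked-language-modeling loss on these logits, the per-sequence embeddings (here, the first-token states) can be reused directly for retrieval-style tasks such as the HotpotQA full-wiki setting mentioned above.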
