Paper Title

MVP: Multi-task Supervised Pre-training for Natural Language Generation

Paper Authors

Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

Paper Abstract

Pre-trained language models (PLMs) have achieved remarkable success in natural language generation (NLG) tasks. To date, most NLG-oriented PLMs have been pre-trained in an unsupervised manner on large-scale general corpora. Meanwhile, a growing number of models pre-trained with labeled data (i.e., "supervised pre-training") have shown superior performance compared to unsupervised pre-trained models. Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation. We collect a large-scale natural language generation corpus, MVPCorpus, from $77$ datasets spanning $11$ diverse NLG tasks. We then unify these examples into a general text-to-text format to pre-train the text generation model MVP in a supervised manner. For each task, we further pre-train task-specific soft prompts to stimulate the model's capacity for that task. Our MVP model can be seen as a practice of applying recent instruction tuning to relatively small PLMs. Extensive experiments demonstrate the effectiveness and generality of our MVP model on a wide range of NLG tasks: it achieves state-of-the-art performance on $13$ out of $17$ datasets, outperforming BART by $9.3\%$ and Flan-T5 by $5.8\%$.
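As a rough illustration of the approach described in the abstract, the sketch below unifies labeled examples from several NLG tasks into a single text-to-text format and trains one sequence-to-sequence model on the mixture. This is a minimal sketch, not the authors' released code: the `facebook/bart-large` starting checkpoint, the toy examples, and the hyperparameters are assumptions for illustration, whereas the actual MVP model is pre-trained on MVPCorpus ($77$ datasets over $11$ tasks).

```python
# Minimal sketch of MVP-style multi-task supervised pre-training.
# Assumptions (not from the paper's code release): a generic BART checkpoint
# as the starting point, three toy examples standing in for MVPCorpus, and
# single-example "batches" with illustrative hyperparameters.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

# Step 1: unify labeled data from different NLG tasks into one text-to-text
# format, prepending a natural-language task instruction to each input.
mixed_examples = [
    ("Summarize: the local council approved the new bridge after a two-hour debate .",
     "The council approved the new bridge."),
    ("Describe the following data: name[Aromi] | eatType[coffee shop] | area[city centre]",
     "Aromi is a coffee shop in the city centre."),
    ("Answer the following question: where is the Eiffel Tower located ?",
     "The Eiffel Tower is located in Paris."),
]

# Step 2: pre-train a single seq2seq model on the task mixture with the
# standard cross-entropy (maximum likelihood) objective.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
model.train()
for source, target in mixed_examples:  # in practice, shuffled mini-batches over the full corpus
    inputs = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**inputs, labels=labels).loss  # decoder inputs are derived from labels internally
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The task-specific soft prompts mentioned in the abstract would form a second stage on top of this sketch: a small set of continuous prompt parameters is pre-trained per task while the multi-task backbone is kept largely intact. That stage is not shown here.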
