Paper Title
Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation
Paper Authors
Paper Abstract
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner. WikiTransfer fine-tunes pretrained models on pseudo-summaries, produced from generic Wikipedia data, which contain characteristics of the target dataset, such as the length and level of abstraction of the desired summaries. WikiTransfer models achieve state-of-the-art, zero-shot abstractive summarization performance on the CNN-DailyMail dataset and demonstrate the effectiveness of our approach on three additional diverse datasets. These models are more robust to noisy data and also achieve better or comparable few-shot performance using 10 and 100 training examples when compared to few-shot transfer from other summarization datasets. To further boost performance, we employ data augmentation via round-trip translation as well as introduce a regularization term for improved few-shot transfer. To understand the role of dataset aspects in transfer performance and the quality of the resulting output summaries, we further study the effect of the components of our unsupervised fine-tuning data and analyze few-shot performance using both automatic and human evaluation.
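The abstract describes fine-tuning on pseudo-summaries built from generic Wikipedia data that mimic target-dataset characteristics such as summary length and level of abstraction. Below is a minimal sketch of one plausible construction under stated assumptions: the lead-sentences heuristic, the unigram-overlap filter, and the function name are illustrative choices, not the paper's exact procedure, which the abstract does not specify.

```python
# A sketch of building (source, pseudo-summary) pairs from a Wikipedia article.
# The lead-sentence heuristic and the overlap filter below are assumptions for
# illustration only; the abstract does not detail the exact construction.
from typing import List, Optional, Tuple

def make_pseudo_summary(
    sentences: List[str],
    summary_len: int = 3,      # target-dataset summary length, in sentences
    max_overlap: float = 0.6,  # cap on unigram overlap to control abstractiveness
) -> Optional[Tuple[str, str]]:
    """Use the leading sentences as the pseudo-summary and the rest as the source.

    Returns None when the article is too short or the pair is more extractive
    than the target dataset calls for.
    """
    if len(sentences) <= summary_len:
        return None
    summary = " ".join(sentences[:summary_len])
    source = " ".join(sentences[summary_len:])
    summary_tokens = set(summary.lower().split())
    source_tokens = set(source.lower().split())
    overlap = len(summary_tokens & source_tokens) / max(len(summary_tokens), 1)
    if overlap > max_overlap:
        return None  # too extractive for a highly abstractive target dataset
    return source, summary
```

Both knobs stand in for the "characteristics of the target dataset" the abstract mentions: `summary_len` would be estimated from the target summaries' typical length, and `max_overlap` from how abstractive they are.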
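The abstract also mentions data augmentation via round-trip translation. A minimal sketch follows, assuming MarianMT models from Hugging Face `transformers`; the abstract does not name the translation system, and the English-German pivot here is an arbitrary choice for illustration.

```python
# A sketch of round-trip translation augmentation (en -> pivot -> en),
# assuming Hugging Face MarianMT checkpoints. The pivot language and model
# names are illustrative; the paper's translation setup is not given here.
from transformers import MarianMTModel, MarianTokenizer

def round_trip_translate(texts: list[str], pivot: str = "de") -> list[str]:
    """Translate English texts to a pivot language and back to get paraphrases."""
    fwd_name = f"Helsinki-NLP/opus-mt-en-{pivot}"
    bwd_name = f"Helsinki-NLP/opus-mt-{pivot}-en"
    fwd_tok = MarianTokenizer.from_pretrained(fwd_name)
    fwd_model = MarianMTModel.from_pretrained(fwd_name)
    bwd_tok = MarianTokenizer.from_pretrained(bwd_name)
    bwd_model = MarianMTModel.from_pretrained(bwd_name)

    def translate(batch, tok, model):
        inputs = tok(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = model.generate(**inputs)
        return tok.batch_decode(outputs, skip_special_tokens=True)

    return translate(translate(texts, fwd_tok, fwd_model), bwd_tok, bwd_model)

# Usage: paraphrase the handful of available training documents to enlarge
# a 10- or 100-example few-shot training set.
# augmented = round_trip_translate(["The quick brown fox jumps over the lazy dog."])
```

The paraphrased documents can be paired with the original reference summaries, multiplying the few-shot training set without new annotation.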