Paper Title
FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
Paper Authors
Paper Abstract
We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS's (Zhang et al., 2020) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. The corrector removes hallucinations present in the reference summary, the contrastor uses contrastive learning to better differentiate nonfactual summaries from factual ones, and the connector bridges the gap between the pre-training and fine-tuning for better transfer of knowledge. Experiments on three downstream tasks demonstrate that FactPEGASUS substantially improves factuality evaluated by multiple automatic metrics and humans. Our thorough analysis suggests that FactPEGASUS is more factual than using the original pre-training objective in zero-shot and few-shot settings, retains factual behavior more robustly than strong baselines, and does not rely entirely on becoming more extractive to improve factuality. Our code and data are publicly available at: https://github.com/meetdavidwan/factpegasus
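To make the contrastor idea concrete, below is a minimal sketch of an InfoNCE-style contrastive objective that pulls the representation of the factual summary toward the document representation while pushing away nonfactual (hallucinated) summaries. This is an illustrative, generic formulation under assumed names and shapes, not the paper's exact loss.

```python
# Generic contrastive-loss sketch (assumed formulation, not FactPEGASUS's exact loss).
import torch
import torch.nn.functional as F

def contrastive_loss(doc_emb, pos_emb, neg_embs, temperature=0.1):
    """
    doc_emb:  (d,)   document (or decoder) representation
    pos_emb:  (d,)   representation of the factual summary
    neg_embs: (k, d) representations of k nonfactual summaries
    """
    doc = F.normalize(doc_emb, dim=-1)
    pos = F.normalize(pos_emb, dim=-1)
    negs = F.normalize(neg_embs, dim=-1)

    pos_sim = (doc * pos).sum() / temperature             # scalar similarity to the factual summary
    neg_sim = negs @ doc / temperature                     # (k,) similarities to nonfactual summaries
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim])    # (1+k,), positive at index 0

    # Cross-entropy with the positive at index 0 is -log softmax of the positive pair.
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))
```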