Paper Title


Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization

Authors

Mehrdad Farahani, Mohammad Gharachorloo, Mohammad Manthouri

Abstract

Text summarization is one of the most critical Natural Language Processing (NLP) tasks, and more and more research is conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for these tasks. This paper proposes two methods to address this task and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed in this paper are mT5 and an encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model for Persian). These models are fine-tuned on the pn-summary dataset. The current work is the first of its kind and, by achieving promising results, can serve as a baseline for any future work.
