Paper Title
Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization
Paper Authors
Paper Abstract
Text summarization is one of the most critical Natural Language Processing (NLP) tasks. More and more research is conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for these tasks. This paper proposes two methods to address this task and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed in this paper are mT5 and an encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model for Persian). These models are fine-tuned on the pn-summary dataset. The current work is the first of its kind and, by achieving promising results, can serve as a baseline for future work.
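The abstract describes two approaches: fine-tuning mT5 directly, and assembling an encoder-decoder model from ParsBERT checkpoints. Below is a minimal sketch (Python, using the Hugging Face transformers library) of how the two models might be instantiated; the hub identifiers ("google/mt5-small", "HooshvareLab/bert-base-parsbert-uncased") and the decoding settings are illustrative assumptions, not the authors' exact setup.

from transformers import (
    AutoTokenizer,
    EncoderDecoderModel,
    MT5ForConditionalGeneration,
)

# Approach 1: mT5, a multilingual text-to-text transformer, loaded for
# conditional generation (summarization). Checkpoint name is an assumption.
mt5_tok = AutoTokenizer.from_pretrained("google/mt5-small")
mt5 = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# Approach 2: a BERT2BERT-style encoder-decoder warm-started from two
# copies of the monolingual ParsBERT checkpoint (assumed hub id).
parsbert = "HooshvareLab/bert-base-parsbert-uncased"
bert_tok = AutoTokenizer.from_pretrained(parsbert)
bert2bert = EncoderDecoderModel.from_encoder_decoder_pretrained(parsbert, parsbert)

# The decoder needs explicit start/end/pad token ids before generation.
bert2bert.config.decoder_start_token_id = bert_tok.cls_token_id
bert2bert.config.eos_token_id = bert_tok.sep_token_id
bert2bert.config.pad_token_id = bert_tok.pad_token_id

# Summarize one (placeholder) Persian article with beam search.
article = "..."  # a Persian news article, e.g. from the pn-summary dataset
inputs = mt5_tok(article, return_tensors="pt", truncation=True, max_length=512)
ids = mt5.generate(**inputs, max_length=128, num_beams=4)
print(mt5_tok.decode(ids[0], skip_special_tokens=True))

In practice both models would first be fine-tuned on pn-summary article-summary pairs (e.g., with a standard sequence-to-sequence training loop) before generation; the snippet only shows how the two architectures the paper compares are constructed.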