Paper Title

Evaluation of Transfer Learning for Polish with a Text-to-Text Model

Authors

Aleksandra Chrabrowa, Łukasz Dragan, Karol Grzegorczyk, Dariusz Kajtoch, Mikołaj Koszowski, Robert Mroczkowski, Piotr Rybak

Abstract

We introduce a new benchmark for assessing the quality of text-to-text models for Polish. The benchmark consists of diverse tasks and datasets: the KLEJ benchmark adapted to the text-to-text setting, English-Polish (en-pl) translation, summarization, and question answering. In particular, since summarization and question answering lack benchmark datasets for the Polish language, we describe their construction and make them publicly available. Additionally, we present plT5, a general-purpose text-to-text model for Polish that can be fine-tuned on various Natural Language Processing (NLP) tasks with a single training objective. Unsupervised denoising pre-training is performed efficiently by initializing the model weights from its multilingual T5 (mT5) counterpart. We evaluate the performance of plT5, mT5, Polish BART (plBART), and Polish GPT-2 (papuGaPT2). plT5 scores highest on all of these tasks except summarization, where plBART is best. In general (except for summarization), the larger the model, the better the results. The encoder-decoder architectures prove better than the decoder-only equivalent.
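
The abstract's central idea, casting every task as "input text in, output text out" under a single training objective, is easy to illustrate. Below is a minimal sketch using the Hugging Face Transformers library; the checkpoint name "allegro/plt5-base" and the "summarize:" task prefix are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch: using a plT5 checkpoint for a text-to-text task.
# Assumptions (not from the paper): the checkpoint name "allegro/plt5-base"
# and the "summarize:" prefix; substitute your own checkpoint and prompt.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("allegro/plt5-base")
model = T5ForConditionalGeneration.from_pretrained("allegro/plt5-base")

# In the text-to-text framing, classification, translation, summarization,
# and question answering all reduce to mapping an input string to an output
# string, so one model and one objective serve every benchmark task.
text = "summarize: Tekst polskiego artykułu do streszczenia ..."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Fine-tuning on any of the benchmark tasks follows the same pattern: pair input strings with target strings and train with the standard sequence-to-sequence cross-entropy loss.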
