Paper Title

Revisiting Pre-Trained Models for Chinese Natural Language Processing

Paper Authors

Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu

Paper Abstract

Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks, and successive variants have been proposed to further improve the performance of pre-trained language models. In this paper, we revisit Chinese pre-trained language models to examine their effectiveness in a non-English language and release a series of Chinese pre-trained language models to the community. We also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways, especially in its masking strategy, which adopts MLM as correction (Mac). We carried out extensive experiments on eight Chinese NLP tasks to revisit the existing pre-trained language models as well as the proposed MacBERT. Experimental results show that MacBERT achieves state-of-the-art performance on many NLP tasks, and we also ablate implementation details and report several findings that may help future research. Resources available: https://github.com/ymcui/MacBERT
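
The abstract's key technical point is the "MLM as correction" (Mac) masking strategy: tokens selected for masking are replaced with similar words rather than the artificial [MASK] token, so the pre-training input resembles naturally corrupted text that the model learns to correct. The snippet below is a minimal sketch of that idea only, not the authors' implementation: `get_similar_word` and `VOCAB` are hypothetical placeholders for a real similar-word lookup, and MacBERT's whole-word and n-gram masking details are omitted.

```python
import random

# Hypothetical vocabulary used only so the fallback similar-word lookup has
# something to draw from; a real setup would use a synonym toolkit or
# embedding-based nearest neighbours.
VOCAB = ["中国", "语言", "模型", "预训练", "自然", "处理", "任务", "实验"]


def get_similar_word(token: str) -> str:
    """Hypothetical stand-in for a real synonym/similar-word lookup."""
    candidates = [w for w in VOCAB if w != token]
    return random.choice(candidates) if candidates else token


def mac_masking(tokens, mask_rate=0.15):
    """Corrupt a token sequence by swapping some tokens for similar words.

    Returns the corrupted sequence and the indices the model should learn
    to correct back to the original tokens (no [MASK] token is introduced).
    """
    corrupted = list(tokens)
    target_positions = []
    for i, tok in enumerate(tokens):
        if random.random() < mask_rate:
            corrupted[i] = get_similar_word(tok)  # similar word, not [MASK]
            target_positions.append(i)
    return corrupted, target_positions


if __name__ == "__main__":
    sentence = ["中国", "自然", "语言", "处理", "任务"]
    corrupted, targets = mac_masking(sentence, mask_rate=0.3)
    print(corrupted, targets)
```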
