Paper Title
Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining
Paper Authors
Paper Abstract
Pre-trained neural language models bring significant improvements to various NLP tasks when fine-tuned on task-specific training sets. During fine-tuning, the parameters are initialized directly from the pre-trained model, which ignores how the learning processes of similar NLP tasks in different domains are correlated and mutually reinforcing. In this paper, we propose an effective learning procedure named Meta Fine-Tuning (MFT), which serves as a meta-learner to solve a group of similar NLP tasks for neural language models. Instead of simply performing multi-task training over all the datasets, MFT learns only from typical instances of various domains to acquire highly transferable knowledge. It further encourages the language model to encode domain-invariant representations by optimizing a series of novel domain corruption loss functions. After MFT, the model can be fine-tuned for each domain with better parameter initialization and higher generalization ability. We implement MFT upon BERT to solve several multi-domain text mining tasks. Experimental results confirm the effectiveness of MFT and its usefulness for few-shot learning.
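The abstract describes MFT as combining per-domain task learning on typical instances with loss terms that push the encoder toward domain-invariant representations. The sketch below illustrates that idea in PyTorch; the `MFTModel` class, the `mft_step` function, the uniform-posterior domain term, and the `lam` weighting are illustrative assumptions rather than the paper's implementation, whose typical-instance selection and domain corruption losses are more elaborate.

```python
# Minimal sketch of an MFT-style training step (assumptions noted above,
# not the authors' exact method). A batch of "typical instances" drawn from
# several domains is used to optimize a shared task loss plus a simple
# domain-invariance term.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MFTModel(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int, num_domains: int):
        super().__init__()
        # Stand-in for a pre-trained encoder such as BERT; an MLP over
        # pre-computed sentence embeddings keeps the sketch self-contained.
        self.encoder = nn.Sequential(nn.Linear(hidden_size, hidden_size), nn.Tanh())
        self.task_head = nn.Linear(hidden_size, num_labels)     # shared task classifier
        self.domain_head = nn.Linear(hidden_size, num_domains)  # auxiliary domain classifier

    def forward(self, x):
        h = self.encoder(x)
        return self.task_head(h), self.domain_head(h)


def mft_step(model, optimizer, embeddings, task_labels, num_domains, lam=0.1):
    """One meta fine-tuning step on typical instances from multiple domains.
    The invariance term pushes the domain classifier's posterior toward the
    uniform distribution, a simple proxy for a domain corruption objective."""
    task_logits, domain_logits = model(embeddings)
    task_loss = F.cross_entropy(task_logits, task_labels)
    # Encourage domain-invariant representations: KL between the predicted
    # domain distribution and the uniform distribution over domains.
    log_probs = F.log_softmax(domain_logits, dim=-1)
    uniform = torch.full_like(log_probs, 1.0 / num_domains)
    domain_loss = F.kl_div(log_probs, uniform, reduction="batchmean")
    loss = task_loss + lam * domain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage with random "sentence embeddings" from three domains.
torch.manual_seed(0)
model = MFTModel(hidden_size=32, num_labels=2, num_domains=3)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
x = torch.randn(12, 32)           # 12 typical instances
y = torch.randint(0, 2, (12,))    # binary task labels
print(mft_step(model, optimizer, x, y, num_domains=3))
```

After such a meta stage, the shared encoder weights would serve as the initialization for standard per-domain fine-tuning, which is the setting the abstract evaluates on BERT.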