Paper Title
Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach
Paper Authors
Paper Abstract
Fine-tuned pre-trained language models (LMs) have achieved enormous success in many natural language processing (NLP) tasks, but they still require excessive labeled data in the fine-tuning stage. We study the problem of fine-tuning pre-trained LMs using only weak supervision, without any labeled data. This problem is challenging because the high capacity of LMs makes them prone to overfitting the noisy labels generated by weak supervision. To address this problem, we develop a contrastive self-training framework, COSINE, to enable fine-tuning LMs with weak supervision. Underpinned by contrastive regularization and confidence-based reweighting, this contrastive self-training framework can gradually improve model fitting while effectively suppressing error propagation. Experiments on sequence, token, and sentence pair classification tasks show that our model outperforms the strongest baseline by large margins on 7 benchmarks in 6 tasks, and achieves competitive performance with fully-supervised fine-tuning methods.
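The abstract names two ingredients, confidence-based reweighting of pseudo-labels and a contrastive regularizer over sample pairs, without giving their formulas. The sketch below is a minimal, hypothetical illustration of how such terms are commonly computed in self-training loops; it is not the COSINE implementation, and the function names, threshold, margin, and loss weighting are all assumptions made for illustration.

```python
# Minimal, hypothetical sketch of one self-training step with
# confidence-based reweighting and a contrastive regularizer.
# NOT the paper's implementation; names, thresholds, and loss
# forms are illustrative assumptions.
import torch
import torch.nn.functional as F


def confidence_weights(probs: torch.Tensor, threshold: float = 0.7) -> torch.Tensor:
    """Weight each pseudo-labeled sample by its prediction confidence,
    zeroing out samples below an (assumed) confidence threshold."""
    conf, _ = probs.max(dim=-1)  # highest class probability per sample
    return torch.where(conf >= threshold, conf, torch.zeros_like(conf))


def reweighted_pseudo_label_loss(logits: torch.Tensor,
                                 pseudo_probs: torch.Tensor,
                                 threshold: float = 0.7) -> torch.Tensor:
    """Soft cross-entropy against pseudo-labels, reweighted by confidence
    so low-confidence (likely noisy) samples contribute little or nothing."""
    log_p = F.log_softmax(logits, dim=-1)
    per_sample = -(pseudo_probs * log_p).sum(dim=-1)  # soft CE per sample
    w = confidence_weights(pseudo_probs, threshold)
    return (w * per_sample).sum() / w.sum().clamp(min=1e-8)


def contrastive_regularizer(features: torch.Tensor,
                            pseudo_labels: torch.Tensor,
                            margin: float = 1.0) -> torch.Tensor:
    """Pull together representations that share a pseudo-label and push
    apart those that do not (a generic margin-based contrastive term)."""
    dist = torch.cdist(features, features, p=2)  # pairwise L2 distances
    same = (pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)).float()
    pull = same * dist.pow(2)
    push = (1 - same) * F.relu(margin - dist).pow(2)
    mask = 1 - torch.eye(len(features), device=features.device)  # drop self-pairs
    return ((pull + push) * mask).sum() / mask.sum()


if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(8, 3)                          # model outputs: 8 samples, 3 classes
    pseudo = F.softmax(torch.randn(8, 3) * 3, dim=-1)   # soft pseudo-labels from a prior round
    feats = torch.randn(8, 16)                          # sentence representations
    loss = (reweighted_pseudo_label_loss(logits, pseudo)
            + 0.1 * contrastive_regularizer(feats, pseudo.argmax(-1)))
    print(float(loss))
```

The intent of combining the two terms is that reweighting suppresses error propagation from noisy weak labels, while the contrastive term keeps the representation space consistent with the (filtered) pseudo-labels as training proceeds.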