Paper Title
PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Paper Authors
Paper Abstract
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score. In this work, we propose PERFECT, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, which is highly effective given as few as 32 data points. PERFECT makes two key design choices: First, we show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning and reduce memory and storage costs by roughly factors of 5 and 100, respectively. Second, instead of using handcrafted verbalizers, we learn new multi-token label embeddings during fine-tuning, which are not tied to the model vocabulary and which allow us to avoid complex auto-regressive decoding. These embeddings are not only learnable from limited data but also enable nearly 100x faster training and inference. Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods. Our code is publicly available at https://github.com/facebookresearch/perfect.git.
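Below is a minimal, illustrative sketch (not the authors' implementation) of the two ideas described in the abstract: a bottleneck adapter inserted into a frozen PLM layer in place of a handcrafted prompt, and learned multi-token label embeddings scored against the hidden states at masked positions instead of a vocabulary-based verbalizer. Class names, module names, and dimensions here are hypothetical assumptions; see the linked repository for the actual method.

```python
# Hypothetical sketch of the adapter + multi-token label-embedding ideas.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class MultiTokenLabelScorer(nn.Module):
    """Scores each class with learned embeddings at the masked positions,
    avoiding handcrafted verbalizers and auto-regressive decoding."""

    def __init__(self, num_classes: int, num_mask_tokens: int = 2,
                 hidden_size: int = 768):
        super().__init__()
        # One learned embedding per (class, mask position); not tied to the
        # model vocabulary.
        self.label_embeds = nn.Parameter(
            torch.randn(num_classes, num_mask_tokens, hidden_size) * 0.02
        )

    def forward(self, mask_hidden: torch.Tensor) -> torch.Tensor:
        # mask_hidden: (batch, num_mask_tokens, hidden_size)
        # Dot-product score per mask position, averaged over positions.
        scores = torch.einsum("bmh,cmh->bcm", mask_hidden, self.label_embeds)
        return scores.mean(dim=-1)  # (batch, num_classes)


# Toy usage: hidden states at two inserted [MASK] positions for a batch of 4.
adapter = Adapter()
scorer = MultiTokenLabelScorer(num_classes=3)
mask_hidden = adapter(torch.randn(4, 2, 768))
logits = scorer(mask_hidden)  # (4, 3); trained with cross-entropy
print(logits.shape)
```

In this sketch only the adapter and label embeddings would be trained while the PLM stays frozen, which is the source of the memory and storage savings the abstract mentions.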