Paper Title
PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Paper Authors
Paper Abstract
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score. In this work, we propose PERFECT, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, which is highly effective given as few as 32 data points. PERFECT makes two key design choices: First, we show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning and reduce memory and storage costs by roughly factors of 5 and 100, respectively. Second, instead of using handcrafted verbalizers, we learn new multi-token label embeddings during fine-tuning, which are not tied to the model vocabulary and which allow us to avoid complex auto-regressive decoding. These embeddings are not only learnable from limited data but also enable nearly 100x faster training and inference. Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods. Our code is publicly available at https://github.com/facebookresearch/perfect.git.
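Below is a minimal, illustrative sketch (not the authors' implementation) of the two ideas described in the abstract: a bottleneck adapter inserted into a frozen PLM layer in place of a handcrafted prompt, and learned multi-token label embeddings scored against the hidden states at masked positions instead of a vocabulary-based verbalizer. Class names, module names, and dimensions here are hypothetical assumptions; see the linked repository for the actual method.

```python
# Hypothetical sketch of the adapter + multi-token label-embedding ideas.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))


class MultiTokenLabelScorer(nn.Module):
    """Scores each class with learned embeddings at the masked positions,
    avoiding handcrafted verbalizers and auto-regressive decoding."""

    def __init__(self, num_classes: int, num_mask_tokens: int = 2,
                 hidden_size: int = 768):
        super().__init__()
        # One learned embedding per (class, mask position); not tied to the
        # model vocabulary.
        self.label_embeds = nn.Parameter(
            torch.randn(num_classes, num_mask_tokens, hidden_size) * 0.02
        )

    def forward(self, mask_hidden: torch.Tensor) -> torch.Tensor:
        # mask_hidden: (batch, num_mask_tokens, hidden_size)
        # Dot-product score per mask position, averaged over positions.
        scores = torch.einsum("bmh,cmh->bcm", mask_hidden, self.label_embeds)
        return scores.mean(dim=-1)  # (batch, num_classes)


# Toy usage: hidden states at two inserted [MASK] positions for a batch of 4.
adapter = Adapter()
scorer = MultiTokenLabelScorer(num_classes=3)
mask_hidden = adapter(torch.randn(4, 2, 768))
logits = scorer(mask_hidden)  # (4, 3); trained with cross-entropy
print(logits.shape)
```

In this sketch only the adapter and label embeddings would be trained while the PLM stays frozen, which is the source of the memory and storage savings the abstract mentions.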