Paper Title
Black-Box Tuning for Language-Model-as-a-Service
Paper Authors
Paper Abstract
Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts to query the PTMs through black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace owing to the low intrinsic dimensionality of large PTMs. The experimental results show that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.
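The abstract summarizes the method: a continuous prompt is optimized through a fixed random projection from a low-dimensional subspace, using only loss values returned by the model's inference API. The sketch below illustrates that loop under stated assumptions: CMA-ES (via the cma package) stands in for the derivative-free optimizer, query_api is a hypothetical placeholder for the black-box service, and the dimensions and hyperparameters are illustrative rather than the paper's exact settings.

```python
# Minimal sketch of black-box prompt tuning via derivative-free optimization.
# Assumptions: `query_api` is a hypothetical stand-in for the inference API;
# dimensions and CMA-ES settings are illustrative only.
import numpy as np
import cma  # pip install cma

D = 1024   # dimensionality of the continuous prompt (prompt length x hidden size), illustrative
d = 100    # dimensionality of the randomly generated subspace, illustrative

rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0 / d, size=(D, d))   # fixed random projection: subspace -> prompt space
_hidden_optimum = rng.normal(size=D)        # synthetic target, only to keep the sketch runnable

def query_api(prompt: np.ndarray) -> float:
    """Stand-in for the black-box inference API.

    In LMaaS, this call would send the continuous prompt (prepended to the
    few-shot inputs) to the service and receive only a scalar task loss back,
    with no gradients. A synthetic quadratic loss substitutes for it here.
    """
    return float(np.mean((prompt - _hidden_optimum) ** 2))

def objective(z: np.ndarray) -> float:
    # Map the low-dimensional candidate z back to the full prompt space.
    return query_api(A @ z)

# CMA-ES searches the d-dimensional subspace using loss values alone.
es = cma.CMAEvolutionStrategy(d * [0.0], 1.0, {"popsize": 20, "maxiter": 100, "verbose": -9})
while not es.stop():
    candidates = es.ask()                                           # sample a population of z vectors
    es.tell(candidates, [objective(np.asarray(z)) for z in candidates])
best_prompt = A @ np.asarray(es.result.xbest)                       # final continuous prompt to prepend
print("best loss:", query_api(best_prompt))
```

In this setup the service provider only ever sees prompt candidates and returns losses, which matches the LMaaS constraint that no gradients flow back to the user.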