Paper Title


Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners

Authors

Hyunsoo Cho, Hyuhng Joon Kim, Junyeob Kim, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim

Abstract


Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, ICL performance does not scale well with the number of available training samples, as it is limited by the inherent input length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feature extractors, allowing them to be utilized in a black-box manner and enabling the linear probing paradigm, where lightweight discriminators are trained on top of pre-extracted input representations. This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and ICL that leverages the best of both worlds. PALP inherits the scalability of linear probing and the capability of enforcing language models to derive more meaningful representations by tailoring inputs into a more conceivable form. Through in-depth investigations on various datasets, we verify that PALP significantly enhances the input representations, closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead, potentially making PALP a strong alternative in a black-box setting.
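The recipe the abstract describes can be sketched in a few lines: wrap each raw input in a prompt template, query the language model once for a fixed representation of the augmented text, then train a lightweight linear discriminator on those pre-extracted features. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: `extract_features` is a random stand-in for a real black-box LM embedding call, and the prompt template in `augment` is hypothetical.

```python
import math
import random

# Hypothetical stand-in for querying a black-box LM for a sentence
# representation. A real PALP setup would send the prompt-augmented text
# to the model and read out an embedding / hidden-state vector; here we
# derive a deterministic pseudo-random vector from the text instead.
def extract_features(text, dim=8):
    rng = random.Random(hash(text))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

# Prompt augmentation: tailor the raw input into a form the LM is more
# likely to represent meaningfully. This template is an illustrative
# assumption, not the paper's exact prompt.
def augment(text):
    return f"Review: {text}\nSentiment:"

def train_linear_probe(samples, dim=8, lr=0.5, epochs=300):
    """Train a logistic-regression probe on pre-extracted features."""
    w, b = [0.0] * dim, 0.0
    # Features are extracted once up front -- the LM is only a feature
    # extractor, so training touches just the linear head.
    feats = [(extract_features(augment(x), dim), y) for x, y in samples]
    for _ in range(epochs):
        for f, y in feats:
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            z = max(min(z, 30.0), -30.0)           # avoid exp overflow
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # dL/dz for log loss
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b

def predict(w, b, text, dim=8):
    f = extract_features(augment(text), dim)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0
```

Because only the small linear head is trained, the approach scales with the number of labeled samples in a way ICL cannot: every extra example adds one feature vector to the training set rather than consuming scarce context-window tokens.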
