Paper Title

On the Compositional Generalization Gap of In-Context Learning

Paper Authors

Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni, Aaron Courville

Paper Abstract

Pretrained large generative language models have shown great performance on many tasks, but exhibit low compositional generalization abilities. Scaling such models has been shown to improve their performance on various NLP tasks even just by conditioning them on a few examples to solve the task without any fine-tuning (also known as in-context learning). In this work, we look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning. In the ID settings, the demonstrations are from the same split (test or train) that the model is being evaluated on, and in the OOD settings, they are from the other split. We look at how the relative generalization gap of in-context learning evolves as models are scaled up. We evaluate four model families, OPT, BLOOM, CodeGen and Codex, on three semantic parsing datasets, CFQ, SCAN and GeoQuery, with different numbers of exemplars, and observe a trend of decreasing relative generalization gap as models are scaled up.
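
To make the evaluation protocol concrete, below is a minimal Python sketch of the ID vs. OOD comparison described in the abstract. It assumes exact-match scoring, uniformly sampled exemplars, and a relative gap defined as (ID accuracy - OOD accuracy) / ID accuracy; the prompt format and the `model.generate` call are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the ID vs. OOD in-context evaluation described in the abstract.
# Assumptions (not from the paper text): the relative generalization gap is computed
# as (ID accuracy - OOD accuracy) / ID accuracy, exemplars are sampled uniformly at
# random, and `model.generate` is a hypothetical text-completion call.

import random

def build_prompt(exemplars, query):
    """Concatenate (input, output) demonstrations, then the query input."""
    demos = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in exemplars)
    return f"{demos}\nInput: {query}\nOutput:"

def accuracy(model, eval_set, demo_pool, k, seed=0):
    """Exact-match accuracy with k in-context exemplars drawn from demo_pool."""
    rng = random.Random(seed)
    correct = 0
    for query, target in eval_set:
        # Never show the evaluated example itself as a demonstration.
        candidates = [ex for ex in demo_pool if ex != (query, target)]
        prompt = build_prompt(rng.sample(candidates, k), query)
        prediction = model.generate(prompt)  # hypothetical API
        correct += int(prediction.strip() == target.strip())
    return correct / len(eval_set)

def relative_generalization_gap(model, test_set, train_set, k=4):
    # ID: demonstrations come from the same split the model is evaluated on (test);
    # OOD: demonstrations come from the other split (train).
    id_acc = accuracy(model, test_set, demo_pool=test_set, k=k)
    ood_acc = accuracy(model, test_set, demo_pool=train_set, k=k)
    return (id_acc - ood_acc) / id_acc
```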
