Paper Title

Probing for the Usage of Grammatical Number

Authors

Lasri, Karim, Pimentel, Tiago, Lenci, Alessandro, Poibeau, Thierry, Cotterell, Ryan

Abstract

A central quest of probing is to uncover how pre-trained models encode a linguistic property within their representations. An encoding, however, might be spurious; i.e., the model might not rely on it when making predictions. In this paper, we try to find encodings that the model actually uses, introducing a usage-based probing setup. We first choose a behavioral task which cannot be solved without using the linguistic property. Then, we attempt to remove the property by intervening on the model's representations. We contend that, if an encoding is used by the model, its removal should harm the performance on the chosen behavioral task. As a case study, we focus on how BERT encodes grammatical number, and on how it uses this encoding to solve the number agreement task. Experimentally, we find that BERT relies on a linear encoding of grammatical number to produce the correct behavioral output. We also find that BERT uses a separate encoding of grammatical number for nouns and verbs. Finally, we identify in which layers information about grammatical number is transferred from a noun to its head verb.
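The core intervention the abstract describes, removing a linearly encoded property from a model's representations and checking whether a probe can still recover it, can be sketched on synthetic data. The sketch below is a simplified single-direction nullspace projection over random vectors, not the paper's exact procedure or real BERT representations; all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "representations": a grammatical-number signal lives
# along one fixed direction, plus isotropic noise.
d, n = 16, 400
labels = rng.integers(0, 2, size=n)            # 0 = singular, 1 = plural
direction = np.zeros(d)
direction[0] = 1.0                             # the linear encoding
reps = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, 2 * direction)

def linear_probe_accuracy(X, y):
    """Fit a least-squares linear probe and report its training accuracy."""
    w, *_ = np.linalg.lstsq(X, 2 * y - 1, rcond=None)
    return np.mean((X @ w > 0) == (y == 1))

before = linear_probe_accuracy(reps, labels)

# Intervention: project each representation onto the nullspace of the
# learned probe direction, erasing the linear encoding of the property.
w, *_ = np.linalg.lstsq(reps, 2 * labels - 1, rcond=None)
u = w / np.linalg.norm(w)
reps_removed = reps - np.outer(reps @ u, u)

after = linear_probe_accuracy(reps_removed, labels)
print(f"probe accuracy before: {before:.2f}, after removal: {after:.2f}")
```

On this toy data the probe separates the classes almost perfectly before the projection and falls toward chance afterwards; the paper's usage-based setup then asks the further question of whether such a removal also degrades the model's *behavioral* output on number agreement.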
