Paper Title

Probing for targeted syntactic knowledge through grammatical error detection

Authors

Christopher Davis, Christopher Bryant, Andrew Caines, Marek Rei, Paula Buttery

Abstract

Targeted studies testing knowledge of subject-verb agreement (SVA) indicate that pre-trained language models encode syntactic information. We assert that if models robustly encode subject-verb agreement, they should be able to identify when agreement is correct and when it is incorrect. To that end, we propose grammatical error detection as a diagnostic probe to evaluate token-level contextual representations for their knowledge of SVA. We evaluate contextual representations at each layer from five pre-trained English language models: BERT, XLNet, GPT-2, RoBERTa, and ELECTRA. We leverage public annotated training data from both English second language learners and Wikipedia edits, and report results on manually crafted stimuli for subject-verb agreement. We find that masked language models linearly encode information relevant to the detection of SVA errors, while the autoregressive models perform on par with our baseline. However, we also observe a divergence in performance when probes are trained on different training sets, and when they are evaluated on different syntactic constructions, suggesting the information pertaining to SVA error detection is not robustly encoded.
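
To make the probing setup concrete, the sketch below shows a layer-wise linear diagnostic probe for token-level SVA error detection. It is a minimal illustration under stated assumptions: it uses a single Hugging Face `transformers` BERT checkpoint and two hand-made toy sentences with invented token labels, whereas the paper trains probes on annotated learner and Wikipedia error-detection corpora, sweeps every layer, and covers five different pre-trained models.

```python
# Minimal sketch: freeze a pre-trained encoder, take token representations
# from one layer, and train a linear probe to flag tokens involved in an
# SVA error. Toy data below is illustrative, not the paper's training data.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-cased"  # assumed checkpoint; the paper probes five models
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
encoder.eval()  # encoder stays frozen; only the probe is trained

# Toy word-level labels: 1 marks a token involved in an SVA error, 0 otherwise.
sentences = [
    ("The keys to the cabinet is on the table".split(),  [0, 0, 0, 0, 0, 1, 0, 0, 0]),
    ("The keys to the cabinet are on the table".split(), [0, 0, 0, 0, 0, 0, 0, 0, 0]),
]

def get_layer_features(words, layer):
    """Return one vector per word from the given layer (first sub-token per word)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).hidden_states[layer][0]  # (seq_len, hidden_size)
    feats, seen = [], set()
    for idx, word_id in enumerate(enc.word_ids()):
        if word_id is not None and word_id not in seen:
            seen.add(word_id)
            feats.append(hidden[idx])
    return torch.stack(feats)

LAYER = 8  # probe one layer here; the paper reports results for every layer
X = torch.cat([get_layer_features(words, LAYER) for words, _ in sentences])
y = torch.tensor([label for _, labels in sentences for label in labels])

probe = nn.Linear(encoder.config.hidden_size, 2)  # linear probe over frozen features
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(probe(X), y)
    loss.backward()
    optimizer.step()

print("token-level error predictions:", probe(X).argmax(dim=-1).tolist())
```

Keeping the encoder frozen and the probe strictly linear is what licenses the paper's reading of probe accuracy as evidence about what the contextual representations themselves encode, rather than what a more powerful classifier could learn on top of them.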
