Paper Title
Word Frequency Does Not Predict Grammatical Knowledge in Language Models
Paper Authors
Paper Abstract
Neural language models learn, to varying degrees of accuracy, the grammatical properties of natural languages. In this work, we investigate whether there are systematic sources of variation in the language models' accuracy. Focusing on subject-verb agreement and reflexive anaphora, we find that certain nouns are systematically understood better than others, an effect which is robust across grammatical tasks and different language models. Surprisingly, we find that across four orders of magnitude, corpus frequency is unrelated to a noun's performance on grammatical tasks. Finally, we find that a novel noun's grammatical properties can be few-shot learned from various types of training data. The results present a paradox: there should be less variation in grammatical performance than is actually observed.
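As a concrete illustration of the kind of grammatical evaluation the abstract describes (and not the paper's actual evaluation code), the sketch below scores a subject-verb agreement minimal pair with an off-the-shelf causal language model: a noun is "understood" if the grammatical variant receives higher probability than the ungrammatical one. The model choice (gpt2), the sentence_log_prob helper, and the example stimuli are assumptions made for illustration only.

```python
# Minimal sketch: score an agreement minimal pair with a causal LM.
# Assumes the HuggingFace transformers library and GPT-2; the paper's
# actual models and stimuli are not specified here.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Sum of token log-probabilities of the sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so the logits at position t score the token at position t+1.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    token_scores = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    return token_scores.sum().item()

# Hypothetical subject-verb agreement minimal pair: the model "knows"
# the noun's number if the grammatical variant scores higher.
grammatical = "The authors near the table are tall."
ungrammatical = "The authors near the table is tall."
print(sentence_log_prob(grammatical) > sentence_log_prob(ungrammatical))
```

The same comparison extends to the reflexive anaphora task by swapping in pairs such as "The boy hurt himself." versus "The boy hurt themselves.", and a per-noun accuracy can then be aggregated over many such pairs.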