Paper Title

How do we get there? Evaluating transformer neural networks as cognitive models for English past tense inflection

Paper Authors

Xiaomeng Ma, Lingyu Gao

Abstract

There is an ongoing debate on whether neural networks can grasp the quasi-regularities in language the way humans do. On a typical quasi-regularity task, English past tense inflection, neural network models have long been criticized for learning only to generalize the most frequent pattern rather than the regular pattern, and thus for failing to learn the abstract categories of regular and irregular verbs, behaving dissimilarly to humans. In this work, we train a set of transformer models with different settings to examine their behavior on this task. The models achieve high accuracy on unseen regular verbs and some accuracy on unseen irregular verbs. The models' performance on regulars is heavily affected by type frequency and ratio but not by token frequency and ratio, and vice versa for irregulars. These different behaviors on regulars and irregulars suggest that the models acquire some degree of symbolic learning of verb regularity. In addition, the models correlate only weakly with human behavior on nonce verbs. Although the transformer model exhibits some level of learning of the abstract category of verb regularity, its performance does not fit human data well, suggesting that it might not be a good cognitive model.
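The abstract contrasts type frequency (the number of distinct verbs exhibiting a pattern) with token frequency (the total number of occurrences of verbs with that pattern). A minimal sketch of this distinction, using a small hypothetical corpus rather than the paper's actual training data:

```python
from collections import Counter

# Hypothetical toy corpus of (verb, past form, is_regular) occurrences.
corpus = [
    ("walk", "walked", True), ("walk", "walked", True),
    ("jump", "jumped", True),
    ("go", "went", False), ("go", "went", False), ("go", "went", False),
    ("sing", "sang", False),
]

def frequency_stats(corpus):
    """Compute type and token frequencies for regular vs. irregular verbs.

    Type frequency counts distinct verbs in each class; token frequency
    counts every occurrence, so high-frequency irregulars like "go" can
    dominate tokens while regulars dominate types.
    """
    tokens = Counter()
    types = {True: set(), False: set()}
    for verb, past, regular in corpus:
        tokens[regular] += 1
        types[regular].add(verb)
    return {
        "regular": {"type": len(types[True]), "token": tokens[True]},
        "irregular": {"type": len(types[False]), "token": tokens[False]},
    }

stats = frequency_stats(corpus)
print(stats)
# In this toy corpus: regulars have 2 types / 3 tokens,
# irregulars have 2 types / 4 tokens.
```

In real corpora, regular verbs typically dominate type frequency while a few irregulars carry high token frequency, which is why the paper manipulates the two independently when probing the models.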
