Paper Title

Language models and brains align due to more than next-word prediction and word-level information

Paper Authors

Gabriele Merlin, Mariya Toneva

Paper Abstract

Pretrained language models have been shown to significantly predict brain recordings of people comprehending language. Recent work suggests that the prediction of the next word is a key mechanism that contributes to this alignment. What is not yet understood is whether prediction of the next word is necessary for this observed alignment or simply sufficient, and whether there are other shared mechanisms or information that are similarly important. In this work, we take a step towards understanding the reasons for brain alignment via two simple perturbations in popular pretrained language models. These perturbations help us design contrasts that can control for different types of information. By contrasting the brain alignment of these differently perturbed models, we show that improvements in alignment with brain recordings are due to more than improvements in next-word prediction and word-level information.
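The abstract does not spell out how brain alignment is measured; in this line of work it is commonly computed with a linear encoding model, i.e. ridge regression from a language model's hidden-state activations onto fMRI responses, scored by held-out correlation. The sketch below illustrates what a contrast between an original and a perturbed model could look like under that assumption; the function names, data shapes, and synthetic inputs are illustrative and are not taken from the paper.

```python
# Minimal sketch (not the authors' released code) of a brain-alignment
# contrast, assuming the standard linear encoding-model setup: ridge-regress
# model activations onto fMRI responses and score alignment as held-out
# voxel-wise correlation. Shapes, the alpha grid, and the synthetic data
# are illustrative assumptions.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def brain_alignment(features: np.ndarray, brain: np.ndarray, n_folds: int = 5) -> float:
    """Mean held-out Pearson correlation between predicted and recorded
    voxel responses; higher means better alignment."""
    fold_scores = []
    for train, test in KFold(n_splits=n_folds).split(features):
        encoder = RidgeCV(alphas=np.logspace(-1, 4, 10)).fit(features[train], brain[train])
        pred = encoder.predict(features[test])
        # Correlate prediction and recording separately for every voxel.
        r = [np.corrcoef(pred[:, v], brain[test][:, v])[0, 1] for v in range(brain.shape[1])]
        fold_scores.append(np.nanmean(r))
    return float(np.mean(fold_scores))

# Hypothetical inputs: activations from an original and a perturbed language
# model (n_timepoints x n_dims) and fMRI recordings (n_timepoints x n_voxels).
rng = np.random.default_rng(0)
acts_original = rng.standard_normal((400, 768))
acts_perturbed = rng.standard_normal((400, 768))
fmri = rng.standard_normal((400, 200))

delta = brain_alignment(acts_original, fmri) - brain_alignment(acts_perturbed, fmri)
print(f"Change in brain alignment caused by the perturbation: {delta:+.3f}")
```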
