Paper Title
Probing Commonsense Knowledge in Pre-trained Language Models with Sense-level Precision and Expanded Vocabulary
Paper Authors
Paper Abstract
Progress on commonsense reasoning is usually measured by performance improvements on Question Answering tasks designed to require commonsense knowledge. However, fine-tuning large Language Models (LMs) on these specific tasks does not directly evaluate the commonsense learned during pre-training. The most direct assessments of commonsense knowledge in pre-trained LMs are arguably cloze-style tasks targeting commonsense assertions (e.g., "A pen is used for [MASK]."). However, this approach is restricted by the LM's vocabulary available for masked predictions, and its precision is subject to the context provided by the assertion. In this work, we present a method for enriching LMs with a grounded sense inventory (i.e., WordNet) available at the vocabulary level, without further training. This modification augments the prediction space of cloze-style prompts to the size of a large ontology while enabling finer-grained (sense-level) queries and predictions. To evaluate LMs with higher precision, we propose SenseLAMA, a cloze-style task featuring verbalized relations from disambiguated triples sourced from WordNet, Wikidata, and ConceptNet. Applying our method to BERT, producing a WordNet-enriched version named SynBERT, we find that LMs can learn non-trivial commonsense knowledge from self-supervision, covering numerous relations, and do so more effectively than comparable similarity-based approaches.