Paper Title
On the Role of Conceptualization in Commonsense Knowledge Graph Construction
Paper Authors
Paper Abstract
Commonsense knowledge graphs (CKGs) like ATOMIC and ASER differ substantially from conventional KGs: they consist of a much larger number of nodes formed by loosely structured text. This enables them to handle highly diverse natural-language queries related to commonsense, but it also poses unique challenges for automatic KG construction methods. Besides identifying relations between nodes that are absent from the KG, such methods are also expected to explore absent nodes represented by text, in which different real-world things, or entities, may appear. To deal with the innumerable entities involved with commonsense in the real world, we introduce conceptualization to CKG construction methods, i.e., viewing entities mentioned in text as instances of specific concepts, or vice versa. We build synthetic triples by conceptualization and further formulate the task as triple classification, handled by a discriminative model with knowledge transferred from pretrained language models and fine-tuned with negative sampling. Experiments demonstrate that our methods can effectively identify plausible triples and expand the KG with triples containing both new nodes and new edges of high diversity and novelty.
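The two data-preparation ideas named in the abstract, building synthetic triples by conceptualization and drawing negative samples for triple classification, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy `IS_A` taxonomy, the helper names `conceptualize` and `negative_samples`, the example triples, and the choice to corrupt only the tail. The classifier built on a pretrained language model is not shown.

```python
import random

# Hypothetical toy is-a taxonomy (instance -> concept); an assumption for illustration.
IS_A = {
    "coffee": "beverage",
    "tea": "beverage",
    "guitar": "instrument",
    "piano": "instrument",
}


def instances_of(concept):
    """Return all known instances of a concept, sorted for determinism."""
    return sorted(e for e, c in IS_A.items() if c == concept)


def conceptualize(triple):
    """Build synthetic triples by abstracting an entity in the head to its
    concept and re-instantiating it with a sibling instance."""
    head, rel, tail = triple
    tokens = head.split()
    synthetic = []
    for i, tok in enumerate(tokens):
        concept = IS_A.get(tok)
        if concept is None:
            continue  # token is not a known entity
        for sibling in instances_of(concept):
            if sibling != tok:
                new_head = " ".join(tokens[:i] + [sibling] + tokens[i + 1:])
                synthetic.append((new_head, rel, tail))
    return synthetic


def negative_samples(triples, k, rng):
    """Corrupt each triple's tail with up to k tails taken from other triples,
    skipping corruptions that are already known positives."""
    known = set(triples)
    tails = sorted({t for _, _, t in triples})
    negatives = []
    for h, r, t in triples:
        candidates = [x for x in tails if x != t and (h, r, x) not in known]
        for x in rng.sample(candidates, min(k, len(candidates))):
            negatives.append((h, r, x))
    return negatives


if __name__ == "__main__":
    seed = ("PersonX drinks coffee", "xEffect", "PersonX feels awake")
    other = ("PersonX plays guitar", "xEffect", "PersonX makes music")
    print(conceptualize(seed))
    print(negative_samples([seed, other], k=1, rng=random.Random(0)))
```

In a pipeline of this shape, the synthetic triples would be candidates to score, and the negatives would be paired with known positives to fine-tune a binary classifier. Whether a synthetic triple is actually plausible is exactly what that classifier has to decide, so the sketch makes no claim about their correctness.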