Paper Title

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases

Authors

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek

Abstract

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.
