Paper Title

TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking

Authors

Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu, Limin Sun

Abstract

Extracting entities and relations from unstructured text has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty of identifying overlapping relations with shared entities. Prior works show that joint learning can result in a noticeable performance gain. However, they usually involve sequential interrelated steps and suffer from the problem of exposure bias. At training time, they predict with ground-truth conditions, whereas at inference they have to extract from scratch. This discrepancy leads to error accumulation. To mitigate the issue, we propose in this paper a one-stage joint extraction model, namely TPLinker, which is capable of discovering overlapping relations sharing one or both entities while being immune to exposure bias. TPLinker formulates joint extraction as a token pair linking problem and introduces a novel handshaking tagging scheme that aligns the boundary tokens of entity pairs under each relation type. Experimental results show that TPLinker performs significantly better on overlapping and multiple relation extraction, and achieves state-of-the-art performance on two public datasets.
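
To make the idea of token pair linking more concrete, the following is a minimal sketch of how triples might be decoded once links between boundary tokens have been predicted. It is not the authors' implementation. The abstract only says that boundary tokens of entity pairs are aligned under each relation type, so the three kinds of links used here (entity head to entity tail, subject head to object head, subject tail to object tail), the function name `decode_triples`, the relation names, and the example sentence are all assumptions made for illustration.

```python
from collections import defaultdict


def decode_triples(tokens, entity_links, head_links, tail_links):
    """Decode (subject, relation, object) triples from token-pair links.

    Hypothetical inputs (assumed for this sketch, not taken from the paper):
      entity_links: set of (start, end) token indices, one pair per entity span
      head_links:   {relation: set of (subject_start, object_start)}
      tail_links:   {relation: set of (subject_end, object_end)}
    """
    # Index entity spans by their first token so head links can be resolved.
    spans_by_head = defaultdict(list)
    for start, end in entity_links:
        spans_by_head[start].append((start, end))

    triples = set()
    for relation, pairs in head_links.items():
        rel_tail_links = tail_links.get(relation, set())
        for subj_head, obj_head in pairs:
            for s_start, s_end in spans_by_head.get(subj_head, []):
                for o_start, o_end in spans_by_head.get(obj_head, []):
                    # Keep the pair only if the span ends are linked as well.
                    if (s_end, o_end) in rel_tail_links:
                        subj = " ".join(tokens[s_start:s_end + 1])
                        obj = " ".join(tokens[o_start:o_end + 1])
                        triples.add((subj, relation, obj))
    return triples


# Illustrative sentence; token indices: Liu=0, Xiang=1, Shanghai=5, China=7.
tokens = "Liu Xiang was born in Shanghai , China".split()
entity_links = {(0, 1), (5, 5), (7, 7)}
head_links = {"born_in": {(0, 5), (0, 7)}, "located_in": {(5, 7)}}
tail_links = {"born_in": {(1, 5), (1, 7)}, "located_in": {(5, 7)}}

triples = decode_triples(tokens, entity_links, head_links, tail_links)
for triple in sorted(triples):
    print(triple)
```

In this toy input, "Liu Xiang" is the subject of two triples and "Shanghai" and "China" each appear in two triples, which is the kind of overlap the abstract refers to. Because every link is read from the same set of predictions in a single pass, no decoding step is conditioned on the output of an earlier one.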
