Paper Title

TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition

Authors

Bill Yuchen Lin, Dong-Ho Lee, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren

Abstract

Training neural models for named entity recognition (NER) in a new domain often requires additional human annotations (e.g., tens of thousands of labeled instances) that are usually expensive and time-consuming to collect. Thus, a crucial research question is how to obtain supervision in a cost-effective way. In this paper, we introduce "entity triggers," an effective proxy of human explanations for facilitating label-efficient learning of NER models. An entity trigger is defined as a group of words in a sentence that helps to explain why humans would recognize an entity in the sentence. We crowd-sourced 14k entity triggers for two well-studied NER datasets. Our proposed model, Trigger Matching Network, jointly learns trigger representations and a soft matching module with self-attention, so that it can generalize easily to unseen sentences for tagging. Our framework is significantly more cost-effective than traditional neural NER frameworks. Experiments show that using only 20% of the trigger-annotated sentences results in performance comparable to using 70% of conventionally annotated sentences.
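
To make the definition of an entity trigger concrete, here is a minimal sketch of how one trigger-annotated sentence might be represented. The class name `TriggerAnnotation`, its fields, the example sentence, and the `RESTAURANT` label are all hypothetical and chosen for illustration; they are not taken from the paper's released data format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TriggerAnnotation:
    """One sentence with a labeled entity and its trigger words (hypothetical schema)."""
    tokens: List[str]
    entity_span: Tuple[int, int]   # token offsets: start inclusive, end exclusive
    entity_type: str
    trigger_indices: List[int]     # positions of the words that explain the entity

    def entity_text(self) -> str:
        start, end = self.entity_span
        return " ".join(self.tokens[start:end])

    def trigger_words(self) -> List[str]:
        return [self.tokens[i] for i in self.trigger_indices]


# Illustrative sentence: "had ... lunch at" is the cue that suggests
# the following span names a restaurant.
example = TriggerAnnotation(
    tokens="We had a fantastic lunch at Rumble Fish yesterday .".split(),
    entity_span=(6, 8),
    entity_type="RESTAURANT",
    trigger_indices=[1, 4, 5],
)

print(example.entity_text())    # Rumble Fish
print(example.trigger_words())  # ['had', 'lunch', 'at']
```

In this sketch the trigger need not be contiguous: it is just the set of token positions that a human annotator would point to as the reason for the label.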
