TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition

Paper title

TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition

Authors

Bill Yuchen Lin, Dong-Ho Lee, Ming Shen, Ryan Moreno, Xiao Huang, Prashant Shiralkar, Xiang Ren

Abstract

Training neural models for named entity recognition (NER) in a new domain often requires additional human annotations (e.g., tens of thousands of labeled instances) that are usually expensive and time-consuming to collect. Thus, a crucial research question is how to obtain supervision in a cost-effective way. In this paper, we introduce "entity triggers," an effective proxy of human explanations for facilitating label-efficient learning of NER models. An entity trigger is defined as a group of words in a sentence that helps to explain why humans would recognize an entity in the sentence. We crowd-sourced 14k entity triggers for two well-studied NER datasets. Our proposed model, Trigger Matching Network, jointly learns trigger representations and a soft matching module with self-attention, so that it can generalize easily to unseen sentences for tagging. Our framework is significantly more cost-effective than traditional neural NER frameworks. Experiments show that using only 20% of the trigger-annotated sentences results in performance comparable to using 70% of conventionally annotated sentences.
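To make the notion of an entity trigger more concrete, the sketch below shows one way a trigger-annotated sentence could be represented, together with a toy soft-matching score against an unseen sentence. It is an illustration only and not the authors' Trigger Matching Network: the class `TriggerAnnotation`, the `RESTAURANT` label, the example sentences, the 2-d word vectors in `TOY_VECTORS`, and the fixed `QUERY` vector are all hypothetical, and attention pooling with cosine similarity stands in for the learned self-attention and soft matching module mentioned in the abstract.

```python
import math
from dataclasses import dataclass


@dataclass
class TriggerAnnotation:
    """One sentence with an entity span and the trigger words that explain it."""
    tokens: list
    entity_span: tuple      # (start, end) token positions, end exclusive
    entity_type: str        # hypothetical label, not taken from the paper
    trigger_indices: list   # positions of the trigger words in `tokens`

    def entity(self):
        start, end = self.entity_span
        return self.tokens[start:end]

    def trigger_words(self):
        return [self.tokens[i] for i in self.trigger_indices]


def softmax(scores):
    # Subtract the maximum for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`; weights come from dot products with `query`."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    return [sum(w * v[d] for w, v in zip(weights, vectors))
            for d in range(len(query))]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-written toy embeddings (assumption); unknown words map to the zero vector.
TOY_VECTORS = {
    "had": [0.9, 0.1], "lunch": [1.0, 0.0], "at": [0.8, 0.2],
    "ate": [0.9, 0.15], "dinner": [0.95, 0.05],
}
QUERY = [1.0, 1.0]  # fixed query; a real model would learn its attention parameters


def embed(words):
    return [TOY_VECTORS.get(w.lower(), [0.0, 0.0]) for w in words]


example = TriggerAnnotation(
    tokens=["We", "had", "lunch", "at", "Rumble", "Fish", "yesterday"],
    entity_span=(4, 6),
    entity_type="RESTAURANT",
    trigger_indices=[1, 2, 3],
)

# Compare the pooled trigger with a pooled unseen sentence.
trigger_vec = attention_pool(embed(example.trigger_words()), QUERY)
unseen = ["They", "ate", "dinner", "at", "Blue", "Door"]
unseen_vec = attention_pool(embed(unseen), QUERY)
score = cosine(trigger_vec, unseen_vec)
print(example.entity(), example.trigger_words(), round(score, 3))
```

Here the trigger words "had lunch at" are the cue for tagging "Rumble Fish", and a high similarity score suggests that the unseen sentence contains a similar cue. In the paper's setting the representations and the matching function are learned jointly rather than fixed by hand as in this sketch.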
