Parallel Instance Query Network for Named Entity Recognition

Paper title

Parallel Instance Query Network for Named Entity Recognition

Authors

Yongliang Shen, Xiaobin Wang, Zeqi Tan, Guangwei Xu, Pengjun Xie, Fei Huang, Weiming Lu, Yueting Zhuang

Abstract

Named entity recognition (NER) is a fundamental task in natural language processing. Recent works treat named entity recognition as a reading comprehension task, constructing type-specific queries manually to extract entities. This paradigm suffers from three issues. First, type-specific queries can only extract one type of entities per inference, which is inefficient. Second, the extraction for different types of entities is isolated, ignoring the dependencies between them. Third, query construction relies on external knowledge and is difficult to apply to realistic scenarios with hundreds of entity types. To deal with them, we propose Parallel Instance Query Network (PIQN), which sets up global and learnable instance queries to extract entities from a sentence in a parallel manner. Each instance query predicts one entity, and by feeding all instance queries simultaneously, we can query all entities in parallel. Instead of being constructed from external knowledge, instance queries can learn their different query semantics during training. For training the model, we treat label assignment as a one-to-many Linear Assignment Problem (LAP) and dynamically assign gold entities to instance queries with minimal assignment cost. Experiments on both nested and flat NER datasets demonstrate that our proposed method outperforms previous state-of-the-art models.
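The label-assignment step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `assign_gold_to_queries`, the `copies` parameter, and the cost values are all hypothetical, and the exhaustive search stands in for whatever assignment solver the paper actually uses. It assumes a cost matrix in which `cost[q][g]` is the cost of letting instance query `q` predict gold entity `g`, lets each gold entity be matched by up to `copies` queries (the one-to-many part), and leaves the remaining queries unassigned at zero cost.

```python
from itertools import permutations


def assign_gold_to_queries(cost, copies=1):
    """Return (assignment, total_cost) with minimal total cost.

    cost[q][g] is the (made-up) cost of query q predicting gold entity g.
    assignment[q] is a gold index, or None if query q predicts no entity.
    """
    num_queries = len(cost)
    num_gold = len(cost[0]) if cost else 0

    # One slot per allowed match: each gold entity appears `copies` times.
    slots = [g for g in range(num_gold) for _ in range(copies)]
    if len(slots) > num_queries:
        raise ValueError("more gold slots than instance queries")

    # Pad with None so that leftover queries are matched to "no entity".
    slots += [None] * (num_queries - len(slots))

    best_total, best = float("inf"), None
    # Brute force over all slot orderings; only viable for tiny inputs.
    for perm in permutations(slots):
        total = sum(cost[q][g] for q, g in enumerate(perm) if g is not None)
        if total < best_total:
            best_total, best = total, list(perm)
    return best, best_total


# Four instance queries, two gold entities, illustrative integer costs.
example_cost = [[9, 2], [1, 8], [5, 6], [4, 3]]
print(assign_gold_to_queries(example_cost, copies=1))  # ([1, 0, None, None], 3)
print(assign_gold_to_queries(example_cost, copies=2))  # ([1, 0, 0, 1], 11)
```

With `copies=1` this reduces to an ordinary one-to-one assignment; with `copies=2` every query receives a gold entity here because there are exactly four slots. For realistic sizes a polynomial-time solver such as the Hungarian algorithm would replace the brute-force loop.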
