An Effective Transition-based Model for Discontinuous NER

Paper title

An Effective Transition-based Model for Discontinuous NER

Authors

Xiang Dai, Sarvnaz Karimi, Ben Hachey, Cecile Paris

Abstract

Unlike widely used Named Entity Recognition (NER) data sets in generic domains, biomedical NER data sets often contain mentions consisting of discontinuous spans. Conventional sequence tagging techniques encode Markov assumptions that are efficient but preclude recovery of these mentions. We propose a simple, effective transition-based model with generic neural encoding for discontinuous NER. Through extensive experiments on three biomedical data sets, we show that our model can effectively recognize discontinuous mentions without sacrificing the accuracy on continuous mentions.
