Paper Title


SEE-Few: Seed, Expand and Entail for Few-shot Named Entity Recognition

Paper Authors

Zeng Yang, Linhai Zhang, Deyu Zhou

Paper Abstract


Few-shot named entity recognition (NER) aims to identify named entities based on only a few labeled instances. Current few-shot NER methods focus on leveraging existing datasets in rich-resource domains, which might fail in a training-from-scratch setting where no source-domain data is used. To tackle the training-from-scratch setting, it is crucial to make full use of the annotation information (the boundaries and entity types). Therefore, in this paper, we propose a novel multi-task (Seed, Expand and Entail) learning framework, SEE-Few, for few-shot NER without using source-domain data. The seeding and expanding modules are responsible for providing candidate spans that are as accurate as possible to the entailing module. The entailing module reformulates span classification as a textual entailment task, leveraging both contextual clues and entity type information. All three modules share the same text encoder and are jointly learned. Experimental results on four benchmark datasets under the training-from-scratch setting show that the proposed method outperforms state-of-the-art few-shot NER methods by a large margin. Our code is available at https://github.com/unveiled-the-red-hat/SEE-Few.
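
The key idea of the entailing module, recasting span classification as textual entailment, can be sketched with an off-the-shelf NLI cross-encoder. The snippet below is an illustrative approximation, not the authors' implementation (see their repository above for the real SEE-Few code): the hypothesis template "<span> is a <type> entity.", the choice of roberta-large-mnli, and the classify_span helper are all assumptions made for this sketch.

# A minimal sketch of span classification as textual entailment,
# assuming a generic MNLI cross-encoder rather than SEE-Few's own
# jointly trained encoder. The hypothesis template is hypothetical.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"  # any NLI cross-encoder works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

def classify_span(sentence: str, span: str, entity_types: list[str]) -> str:
    """Score 'sentence entails hypothesis' for each candidate entity type
    and return the type with the highest entailment probability."""
    hypotheses = [f"{span} is a {t} entity." for t in entity_types]
    inputs = tokenizer([sentence] * len(hypotheses), hypotheses,
                       return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # For roberta-large-mnli, label index 2 is the "entailment" class.
    entail_prob = logits.softmax(dim=-1)[:, 2]
    return entity_types[int(entail_prob.argmax())]

print(classify_span("Barack Obama visited Paris last week.",
                    "Paris", ["person", "location", "organization"]))

Framing the task this way lets the verbalized entity-type names and the full sentence context both flow through the encoder, which is what makes the reformulation attractive when only a few labeled instances are available.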
