Paper Title

Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction?

Authors

Jiashu Xu, Mingyu Derek Ma, Muhao Chen

Abstract

Two key obstacles in biomedical relation extraction (RE) are the scarcity of annotations and the prevalence of instances without explicitly pre-defined labels due to low annotation coverage. Existing approaches, which treat biomedical RE as a multi-class classification task, often generalize poorly in low-resource settings and cannot make selective predictions on unknown cases; instead, they guess among the seen relations, which limits their applicability. We present NBR, which converts biomedical RE into a natural language inference (NLI) formulation through indirect supervision. By converting relations to natural language hypotheses, NBR can exploit semantic cues to alleviate annotation scarcity. By incorporating a ranking-based loss that implicitly calibrates abstinent instances, NBR learns a clearer decision boundary and is instructed to abstain on uncertain instances. Extensive experiments on three widely used biomedical RE benchmarks, namely ChemProt, DDI and GAD, verify the effectiveness of NBR in both full-set and low-resource regimes. Our analysis demonstrates that indirect supervision benefits biomedical RE even when a domain gap exists, and that combining NLI knowledge with biomedical knowledge leads to the best performance gains.
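
To make the idea in the abstract concrete, here is a minimal sketch of how relations might be verbalized as hypotheses, ranked with a margin loss, and rejected when nothing scores high enough. It is not the authors' implementation: the templates, the relation names, the `margin` and `threshold` values, and the explicit threshold rule for abstention are all assumptions made for illustration, and the entailment scores are passed in as plain numbers rather than produced by an NLI model.

```python
from typing import Dict, Optional

# Hypothetical verbalization templates (assumed, not taken from the paper).
TEMPLATES: Dict[str, str] = {
    "inhibitor": "{head} inhibits {tail}.",
    "activator": "{head} activates {tail}.",
    "substrate": "{head} is a substrate of {tail}.",
}


def build_hypotheses(head: str, tail: str) -> Dict[str, str]:
    """Turn every candidate relation into a natural language hypothesis."""
    return {rel: tpl.format(head=head, tail=tail) for rel, tpl in TEMPLATES.items()}


def margin_ranking_loss(scores: Dict[str, float], gold: str, margin: float = 0.5) -> float:
    """Hinge-style ranking loss: the gold hypothesis should outscore each
    other hypothesis by at least `margin` (an assumed hyperparameter)."""
    gold_score = scores[gold]
    return sum(
        max(0.0, margin - (gold_score - score))
        for rel, score in scores.items()
        if rel != gold
    )


def predict(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the best-scoring relation, or None to abstain when even the
    best hypothesis falls below `threshold` (an assumed decision rule)."""
    best = max(scores, key=lambda rel: scores[rel])
    return best if scores[best] >= threshold else None


if __name__ == "__main__":
    # The sentence itself would serve as the NLI premise; scores are made up.
    print(build_hypotheses("aspirin", "COX-1"))
    example_scores = {"inhibitor": 0.9, "activator": 0.2, "substrate": 0.6}
    print(margin_ranking_loss(example_scores, gold="inhibitor"))
    print(predict(example_scores))
```

In a real system the scores would come from an NLI model applied to each (sentence, hypothesis) pair, and the abstract indicates that abstention is learned through the ranking loss itself rather than through a fixed cutoff like the one assumed here.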
