Paper Title
Why Did You Not Compare With That? Identifying Papers for Use as Baselines
Paper Authors
Paper Abstract
We propose the task of automatically identifying the papers used as baselines in a scientific article. We frame the problem as binary classification: every reference in a paper is classified as either a baseline or a non-baseline. The task is challenging because a baseline reference can appear in a paper in numerous ways. We develop a dataset of $2,075$ papers from the ACL Anthology corpus, with all their references manually annotated as one of the two classes. We develop a multi-module attention-based neural classifier that outperforms four state-of-the-art citation role classification methods when applied to the baseline classification task. We also present an analysis of the errors made by the proposed classifier, highlighting the factors that make baseline identification difficult.