Paper title
SLEDGE: A Simple Yet Effective Baseline for COVID-19 Scientific Knowledge Search
Paper authors
Paper abstract
With worldwide concern surrounding Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of literature on the virus. Clinicians, researchers, and policy-makers need a way to effectively search these articles. In this work, we present a search system called SLEDGE, which utilizes SciBERT to effectively re-rank articles. We train the model on a general-domain answer ranking dataset, and transfer the relevance signals to SARS-CoV-2 for evaluation. We observe SLEDGE's effectiveness as a strong baseline on the TREC-COVID challenge (topping the leaderboard with an nDCG@10 of 0.6844). A detailed analysis suggests some potential future directions to explore, including the importance of filtering by date and the potential of neural methods that rely more heavily on count signals. We release the code to facilitate future work on this critical task at https://github.com/Georgetown-IR-Lab/covid-neural-ir
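
For readers unfamiliar with the reported metric, the following is a minimal sketch of how nDCG@10 can be computed for a single query. It is not taken from the SLEDGE code base. It assumes the common formulation with linear gains and a log2(rank + 1) discount; the function names `dcg_at_k` and `ndcg_at_k` and the example relevance grades are hypothetical, chosen only for illustration.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k results.

    Assumes linear gain and a log2(rank + 1) discount, with ranks starting at 1.
    """
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_gains: Sequence[float], all_gains: Sequence[float], k: int = 10) -> float:
    """nDCG@k for one query.

    ranked_gains: relevance grades of the returned documents, in ranked order.
    all_gains:    relevance grades of every judged document for the query,
                  used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(all_gains, reverse=True), k)
    # A query with no relevant documents contributes 0 by convention here.
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical grades: 2 = relevant, 1 = partially relevant, 0 = not relevant.
    system_ranking = [2, 0, 1, 2, 0]
    judged_pool = [2, 2, 1, 1, 0, 0]
    print(f"nDCG@10 = {ndcg_at_k(system_ranking, judged_pool):.4f}")
```

A leaderboard score such as the one quoted in the abstract would be the mean of this per-query value over all topics; the exact gain and discount settings of the official evaluation are not stated in this section.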