Paper Title
SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval
Authors
Abstract
We introduce SPARTA, a novel neural retrieval method that shows great promise in performance, generalization, and interpretability for open-domain question answering. Unlike many neural ranking methods that use dense vector nearest neighbor search, SPARTA learns a sparse representation that can be efficiently implemented as an inverted index. The resulting representation enables scalable neural retrieval that does not require expensive approximate vector search and leads to better performance than its dense counterpart. We validate our approach on 4 open-domain question answering (OpenQA) tasks and 11 retrieval question answering (ReQA) tasks. SPARTA achieves new state-of-the-art results across a variety of open-domain question answering tasks on both English and Chinese datasets, including open SQuAD, Natural Questions, and CMRC. Analysis also confirms that the proposed method creates human-interpretable representations and allows flexible control over the trade-off between performance and efficiency.
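To make the inverted-index idea in the abstract concrete, here is a minimal sketch of sparse retrieval. It is not the authors' implementation: the per-document term weights are hand-written stand-ins for scores a trained model would produce, and the names `build_inverted_index`, `search`, `top_k`, and `DOC_TERM_WEIGHTS` are hypothetical. The `top_k` parameter illustrates one plausible way sparsity could trade accuracy for a smaller index.

```python
from collections import defaultdict

# Hypothetical term weights per document. In a learned sparse retriever
# these would come from a trained model; here they are made up by hand.
DOC_TERM_WEIGHTS = {
    "d1": {"sparta": 2.0, "retrieval": 1.5, "index": 0.5},
    "d2": {"dense": 1.8, "vector": 1.2, "retrieval": 0.4},
    "d3": {"index": 1.0, "inverted": 2.5, "sparse": 0.9},
}


def build_inverted_index(doc_term_weights, top_k=2):
    """Keep the top_k heaviest terms of each document and invert the mapping.

    A smaller top_k gives a sparser, smaller index at the cost of recall.
    """
    index = defaultdict(list)
    for doc_id, weights in doc_term_weights.items():
        kept = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        for term, weight in kept:
            if weight > 0:
                index[term].append((doc_id, weight))
    return index


def search(index, query_terms, limit=3):
    """Score documents by summing the stored weights of matching query terms."""
    scores = defaultdict(float)
    for term in query_terms:
        # Only posting lists of query terms are touched; no vector search.
        for doc_id, weight in index.get(term, []):
            scores[doc_id] += weight
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


if __name__ == "__main__":
    index = build_inverted_index(DOC_TERM_WEIGHTS, top_k=2)
    print(search(index, ["retrieval", "index"]))
```

Because each posting stores the term that contributed to the score, the ranking can be inspected term by term, which is one way to read the interpretability claim in the abstract.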