Paper Title

A Memory Efficient Baseline for Open Domain Question Answering

Authors

Gautier Izacard, Fabio Petroni, Lucas Hosseini, Nicola De Cao, Sebastian Riedel, Edouard Grave

Abstract

Recently, retrieval systems based on dense representations have led to important improvements in open-domain question answering and related tasks. While very effective, this approach is also memory intensive, as the dense vectors for the whole knowledge source need to be kept in memory. In this paper, we study how the memory footprint of dense retriever-reader systems can be reduced. We consider three strategies to reduce the index size: dimension reduction, vector quantization, and passage filtering. We evaluate our approach on two question answering benchmarks, TriviaQA and NaturalQuestions, showing that it is possible to get competitive systems using less than 6 GB of memory.
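
To make the three strategies concrete, the following is a rough back-of-the-envelope sketch of how each one shrinks a dense index. All numbers are illustrative assumptions, not figures from the paper: the passage count, the 768-to-256 dimension reduction, the 64 one-byte codes per vector, and the 50% filtering rate are hypothetical, and the helper names are made up for this example. The quantized estimate also ignores the small fixed cost of the codebooks.

```python
def flat_index_gb(num_passages: int, dim: int, bytes_per_value: int = 4) -> float:
    """Size in GB of an uncompressed index storing one dense vector per passage."""
    return num_passages * dim * bytes_per_value / 1e9


def quantized_index_gb(num_passages: int, num_codes: int, bits_per_code: int = 8) -> float:
    """Size in GB when each vector is stored as `num_codes` quantization codes."""
    return num_passages * num_codes * bits_per_code / 8 / 1e9


# Hypothetical corpus size, chosen only to make the arithmetic easy to follow.
NUM_PASSAGES = 20_000_000

# Baseline: 768-dimensional float32 vectors (assumed encoder output size).
baseline = flat_index_gb(NUM_PASSAGES, dim=768)

# Strategy 1, dimension reduction: keep float32 but project down to 256 dims.
reduced = flat_index_gb(NUM_PASSAGES, dim=256)

# Strategy 2, vector quantization: store 64 one-byte codes per vector instead.
quantized = quantized_index_gb(NUM_PASSAGES, num_codes=64)

# Strategy 3, passage filtering: drop half of the passages before indexing.
filtered = quantized_index_gb(NUM_PASSAGES // 2, num_codes=64)

for name, size in [
    ("baseline", baseline),
    ("dimension reduction", reduced),
    ("+ quantization", quantized),
    ("+ passage filtering", filtered),
]:
    print(f"{name:>20}: {size:6.2f} GB")
```

Under these assumed settings the strategies compose multiplicatively, which is why combining them can bring an index from tens of gigabytes down to a few gigabytes or less; the accuracy cost of each step is what the paper's experiments measure.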