论文标题

等级的工程紧凑数据结构和选择位向量的查询

Engineering Compact Data Structures for Rank and Select Queries on Bit Vectors

论文作者

Kurpicz, Florian

论文摘要

位矢量是许多简洁数据结构的基本构建基块。它们可用于表示图形,是小波树形式的许多文本索引的重要组成部分,可用于将整数的有序序列编码为Elias-Fano代码。为此,必须回答两个查询:即排名和选择查询。给定位置向量的位置,等级查询在该位置之前返回1位的数量。给定参数$ j $的选择查询返回$ j $ -th 1位的位置。在长度为n位矢量上,两个查询都可以在$ o(1)$时间中回答,并且需要$ o(n)$额外的空间。实际上,最小(未压缩的)等级和选择数据结构CS-Poppy的空间为$ \ $ \ $ 3.51%[Zhou等,Sea 13]。在本文中,我们提出了一个改进的等级和选择的数据结构,该数据结构的空间相同,但与CS-Poppy相比,可以回答高达8%(等级)和16.5%(选择)的查询。

Bit vectors are fundamental building blocks of many succinct data structures. They can be used to represent graphs, are an important part of many text indices in the form of the wavelet tree, and can be used to encode ordered sequences of integers as Elias-Fano codes. To do so, two queries have to be answered: namely rank and select queries. Given a position in the bit vector, a rank query returns the number of 1-bits before that position. A select query, given a parameter $j$, returns the position of the $j$-th 1-bit. On a length-n bit vector, both queries can be answered in $O(1)$ time and require $o(n)$ bits of additional space. In practice, the smallest (uncompressed) rank and select data structure cs-poppy has a space overhead of $\approx$ 3.51% [Zhou et al., SEA 13]. In this paper, we present an improved rank and select data structure that has the same space overhead but can answer queries up to 8% (rank) and 16.5% (select) faster compared with cs-poppy.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源