论文标题

用于burrows-wheeler的内置计算的硬件体系结构单个迭代中的变换

Hardware Architecture for Inplace Compute of Burrows-Wheeler Transform in a Single Iteration

论文作者

Stangherlin, Kleber, Kennings, Andrew

论文摘要

BZIP2压缩机家族使用了Burrows-wheeler变换(BWT)。在本文中,我们提出了一个硬件体系结构,该硬件架构实现了内置算法来计算BWT。我们的设计没有后缀阵列或输出数组的明确存储。我们实现的性能是固定的,不取决于输入字符串内容。我们在Scanchain配置中使用基于寄存器的字符缓冲区,以便在加载字符时从右到左计算BWT。每六个周期每六个周期进行加载新字符,以相同的速率从先前计算的块中产生一个新的输出字符。我们的FGPA实现不使用Block RAM实例,并且对于128 B,1 KB,4 KB,4 KB和8 KB的块尺寸达到66、35、18和15 MB/S的吞吐量。我们还报告了在65 nm CMO中实现ASIC实现的结果,当使用128 B的块大小。

The Burrows-Wheeler transform (BWT) is used by the bzip2 family of compressors. In this paper, we present a hardware architecture that implements an inplace algorithm to compute the BWT. Our design does not have explicit storage for the suffix array, or output array. The performance of our implementation is fixed, and does not depend on the input string content. We use a register based character buffer in a scanchain configuration, such that the BWT is computed from right to left, as characters are loaded. Loading new characters is done every six cycles, producing a new output character from the previously computed block at the same rate. Our FGPA implementation does not use block ram instances, and achieves throughput of 66, 35, 18, and 15 MB/s for block sizes of 128 B, 1 kB, 4 kB, and 8 kB. We also report results for an ASIC implementation in 65 nm CMOS that achieves 161 MB/s when using block size of 128 B.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源