Paper Title
Fast Lossless Neural Compression with Integer-Only Discrete Flows
Authors
Abstract
By pairing entropy codecs with learned data distributions, neural compressors significantly outperform traditional codecs in compression ratio. However, the high inference latency of neural networks hinders the deployment of neural compressors in practical applications. In this work, we propose Integer-only Discrete Flows (IODF), an efficient neural compressor that uses integer-only arithmetic. Our work builds on integer discrete flows, which consist of invertible transformations between discrete random variables. We propose efficient invertible transformations with integer-only arithmetic based on 8-bit quantization. Our invertible transformations are equipped with learnable binary gates that remove redundant filters during inference. We deploy IODF with TensorRT on GPUs, achieving a 10x inference speedup over the fastest existing neural compressors while retaining high compression rates on ImageNet32 and ImageNet64.
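To make the idea of an invertible transformation between discrete variables concrete, the following is a minimal sketch of an additive coupling step on integers, where the shift is computed with integer multiply, add, and bit-shift only. It is an illustration under stated assumptions, not the IODF architecture: the names `int_translation`, `W_Q`, `B_Q`, and `SHIFT` are hypothetical, and a single fixed-point weight and bias stand in for the learned, quantized network described in the abstract.

```python
# Hypothetical fixed-point parameters standing in for a learned, quantized network.
W_Q = 6     # weight, interpreted as 6 / 2**4 = 0.375
B_Q = -20   # bias, interpreted as -20 / 2**4 = -1.25
SHIFT = 4   # number of fractional bits in the fixed-point format


def int_translation(x_a):
    """Integer-only stand-in for the shift t(x_a): multiply, add, rounding right shift."""
    return [(W_Q * v + B_Q + (1 << (SHIFT - 1))) >> SHIFT for v in x_a]


def forward(x):
    """Additive coupling: keep the first half, shift the second half by t(first half)."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    t = int_translation(x_a)
    return x_a + [b + s for b, s in zip(x_b, t)]


def inverse(z):
    """Exact inverse: recompute the same integer shift and subtract it."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    t = int_translation(z_a)
    return z_a + [b - s for b, s in zip(z_b, t)]


x = [3, 250, 17, 42, 0, 128]  # example integer symbols, even length assumed
z = forward(x)
assert inverse(z) == x        # the mapping is lossless on integers
```

Because the shift depends only on the untouched half and is itself an integer, the inverse recovers the input exactly, which is the property lossless compression requires. Computing the shift without floating point also means the encoder and decoder cannot disagree because of rounding differences across hardware.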