Paper Title

Generative Low-bitwidth Data Free Quantization

Authors

Xu, Shoukai, Li, Haokun, Zhuang, Bohan, Liu, Jing, Cao, Jiezhang, Liang, Chuangrun, Tan, Mingkui

Abstract

Neural network quantization is an effective way to compress deep models and improve their execution latency and energy efficiency, so that they can be deployed on mobile or embedded devices. Existing quantization methods require the original data for calibration or fine-tuning to achieve good performance. However, in many real-world scenarios, the data may be unavailable due to confidentiality or privacy concerns, making existing quantization methods inapplicable. Moreover, in the absence of original data, the recently developed generative adversarial networks (GANs) cannot be applied to generate data. Although the full-precision model may contain rich data information, such information alone is hard to exploit for recovering the original data or generating new meaningful data. In this paper, we investigate a simple yet effective method called Generative Low-bitwidth Data Free Quantization (GDFQ) to remove the data-dependence burden. Specifically, we propose a knowledge matching generator that produces meaningful fake data by exploiting the classification boundary knowledge and distribution information in the pre-trained model. With the help of the generated data, we can quantize a model by learning knowledge from the pre-trained model. Extensive experiments on three datasets demonstrate the effectiveness of our method. More critically, our method achieves much higher accuracy under 4-bit quantization than the existing data-free quantization method. Code is available at https://github.com/xushoukai/GDFQ.
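The abstract's full pipeline involves a generator trained against the pre-trained model, but the low-bitwidth quantization it builds on can be illustrated in isolation. The sketch below shows asymmetric uniform quantization of a weight tensor to a small number of bits, a standard formulation; the function name and details are ours, not taken from the GDFQ code.

```python
import numpy as np

def uniform_quantize(w, num_bits=4):
    """Asymmetric uniform quantization: map float weights onto a grid of
    2**num_bits integer levels, then dequantize back to floats (the usual
    "simulated quantization" used when fine-tuning a quantized model)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    # Scale maps the float range onto the integer range; guard degenerate case.
    scale = (w_max - w_min) / (qmax - qmin) if w_max > w_min else 1.0
    zero_point = np.round(qmin - w_min / scale)
    q = np.clip(np.round(w / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale  # dequantized approximation of w

# Example: 4-bit quantization of a small weight vector.
weights = np.linspace(-1.0, 1.0, 9)
wq = uniform_quantize(weights, num_bits=4)
```

With only 16 levels, the round-trip error of each weight is bounded by half the quantization step, which is why 4-bit quantization degrades accuracy noticeably and calibration or fine-tuning on (real or generated) data is needed to recover it.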
