Paper Title
Hardware-Efficient Deconvolution-Based GAN for Edge Computing
Authors
Abstract
Generative Adversarial Networks (GANs) are cutting-edge algorithms for generating new data samples based on a learned data distribution. However, their performance comes at a significant cost in computation and memory. In this paper, we propose a HW/SW co-design approach for training a quantized deconvolution GAN (QDCGAN) and implementing it on an FPGA, using a scalable streaming dataflow architecture that achieves a better trade-off between throughput and resource utilization. The accelerator is built on an efficient deconvolution engine that offers high parallelism with respect to scaling factors for GAN-based edge computing. Furthermore, various precisions, datasets, and network scalability are analyzed for low-power inference on resource-constrained platforms. Lastly, an end-to-end open-source framework is provided for training, implementation, state-space exploration, and scaling the inference using Vivado high-level synthesis for Xilinx SoC-FPGAs, along with a testbed for comparison against the Jetson Nano.
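To make the idea of parallelism with respect to the scaling factor concrete, the following is a minimal sketch in plain Python of a 1D transposed convolution (deconvolution) with no padding. It is an illustration under stated assumptions, not the accelerator described in the paper: the function names `deconv1d_scatter` and `deconv1d_by_phase` are hypothetical, and the "scaling factor" is assumed here to correspond to the stride. The sketch shows that the output can be split into `stride` independent groups (phases), each using only a subset of the kernel taps, which is one common way such an operation can be parallelized.

```python
def deconv1d_scatter(x, w, stride):
    """Reference form: every input sample scatters a scaled kernel copy into the output."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def deconv1d_by_phase(x, w, stride):
    """Output-first form: output o only needs kernel taps k with k % stride == o % stride."""
    n_out = (len(x) - 1) * stride + len(w)
    out = [0] * n_out
    # The `stride` phases share no outputs, so they could be computed in parallel
    # (assumption for illustration; not the paper's documented engine).
    for phase in range(stride):
        taps = range(phase, len(w), stride)
        for o in range(phase, n_out, stride):
            acc = 0
            for k in taps:
                i = (o - k) // stride  # exact division: o and k share the same phase
                if 0 <= i < len(x):
                    acc += x[i] * w[k]
            out[o] = acc
    return out


x = [1, 2, 3]
w = [1, 1, 1, 1]
result = deconv1d_by_phase(x, w, stride=2)
```

Both functions compute the same output; the second avoids read-modify-write accumulation into shared output locations, which is generally what makes an output-first formulation attractive for streaming hardware.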