Hardware-Efficient Deconvolution-Based GAN for Edge Computing

Paper Title

Hardware-Efficient Deconvolution-Based GAN for Edge Computing

Authors

Azzam Alhussain, Mingjie Lin

Abstract

Generative adversarial networks (GANs) are cutting-edge algorithms for generating new data samples from a learned data distribution. However, their performance comes at a significant cost in computation and memory. In this paper, we propose a HW/SW co-design approach for training a quantized deconvolution GAN (QDCGAN) implemented on an FPGA, using a scalable streaming dataflow architecture that achieves a better trade-off between throughput and resource utilization. The accelerator is built on an efficient deconvolution engine that offers high parallelism with respect to scaling factors for GAN-based edge computing. Furthermore, various precisions, datasets, and network scalability are analyzed for low-power inference on resource-constrained platforms. Lastly, an end-to-end open-source framework is provided for training, implementation, state-space exploration, and scaling of inference using Vivado high-level synthesis for Xilinx SoC-FPGAs, together with a testbed for comparison with Jetson Nano.
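The abstract does not describe the internals of the deconvolution engine, so the following is only a minimal sketch of one common way a transposed convolution exposes parallelism in its scaling factor (stride). It assumes no padding and a square kernel, and the function names `deconv2d_scatter` and `deconv2d_by_phase` are hypothetical, not taken from the paper's framework. With stride `s`, the output splits into `s * s` phases that share no output pixels, so each phase could in principle be computed by an independent unit. Integer inputs stand in for quantized values so the two results can be compared exactly.

```python
def deconv2d_scatter(x, w, stride):
    """Reference transposed convolution: every input pixel scatters a scaled kernel."""
    h, wd, k = len(x), len(x[0]), len(w)
    oh, ow = (h - 1) * stride + k, (wd - 1) * stride + k
    out = [[0] * ow for _ in range(oh)]
    for i in range(h):
        for j in range(wd):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    return out


def deconv2d_by_phase(x, w, stride):
    """Same result, computed as stride * stride independent output phases."""
    h, wd, k = len(x), len(x[0]), len(w)
    oh, ow = (h - 1) * stride + k, (wd - 1) * stride + k
    out = [[0] * ow for _ in range(oh)]
    for p in range(stride):
        for q in range(stride):
            # Phase (p, q) owns the outputs with row % stride == p and
            # col % stride == q; no other phase writes to these pixels.
            for oy in range(p, oh, stride):
                for ox in range(q, ow, stride):
                    acc = 0
                    # Only kernel taps with a % stride == p and b % stride == q
                    # can reach this output pixel.
                    for a in range(p, k, stride):
                        i = (oy - a) // stride
                        if not 0 <= i < h:
                            continue
                        for b in range(q, k, stride):
                            j = (ox - b) // stride
                            if 0 <= j < wd:
                                acc += x[i][j] * w[a][b]
                    out[oy][ox] = acc
    return out


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    w = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    assert deconv2d_by_phase(x, w, 2) == deconv2d_scatter(x, w, 2)
```

In the phase form, each output pixel is written exactly once by a gather over a fixed subset of kernel taps, which avoids the overlapping read-modify-write accumulation of the scatter form. Whether the paper's engine uses this decomposition is not stated in the abstract.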
