Acceleration of Convolutional Neural Network Using FFT-Based Split Convolutions

Paper Title

Acceleration of Convolutional Neural Network Using FFT-Based Split Convolutions

Authors

Chitsaz, Kamran; Hajabdollahi, Mohsen; Karimi, Nader; Samavi, Shadrokh; Shirani, Shahram

Abstract

Convolutional neural networks (CNNs) have a large number of variables, which makes their implementation computationally complex. Different methods, such as quantization and pruning, have been developed to alleviate this complexity. Among these simplification methods, computation in the Fourier domain is regarded as a new paradigm for accelerating CNNs. Recent studies on Fast Fourier Transform (FFT) based CNNs aim at simplifying the computations required for the FFT. However, there is still considerable room for reducing the computational complexity of the FFT. In this paper, a new method for CNN processing in the FFT domain is proposed, which is based on input splitting. Computing the FFT with small kernels, as is typical in CNNs, is problematic, and splitting can be considered an effective solution to the issues caused by small kernels. With splitting, redundancy such as that of overlap-and-add is reduced and efficiency is increased. A hardware implementation of the proposed FFT method, together with several analyses of its complexity, is presented to demonstrate the performance of the proposed method.
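
To make the idea of input splitting concrete, the following is a minimal sketch of the classical overlap-and-add scheme that the abstract refers to, not the authors' proposed method, whose details are not given here. It assumes NumPy; the function names `fft_conv2d_full` and `split_fft_conv2d` and the `tile` parameter are hypothetical and chosen only for illustration. The sketch computes a true convolution (flipped kernel), whereas CNN layers usually compute cross-correlation.

```python
import numpy as np

def fft_conv2d_full(x, k):
    """Full 2-D linear convolution using one FFT padded to the output size."""
    h = x.shape[0] + k.shape[0] - 1
    w = x.shape[1] + k.shape[1] - 1
    spec = np.fft.rfft2(x, s=(h, w)) * np.fft.rfft2(k, s=(h, w))
    return np.fft.irfft2(spec, s=(h, w))

def split_fft_conv2d(x, k, tile=8):
    """Overlap-and-add: convolve each input tile with small FFTs, then sum.

    `tile` (hypothetical parameter) is the side length of each input split.
    """
    kh, kw = k.shape
    out = np.zeros((x.shape[0] + kh - 1, x.shape[1] + kw - 1))
    fh, fw = tile + kh - 1, tile + kw - 1      # FFT size per tile
    k_spec = np.fft.rfft2(k, s=(fh, fw))       # kernel spectrum, computed once
    for i in range(0, x.shape[0], tile):
        for j in range(0, x.shape[1], tile):
            block = x[i:i + tile, j:j + tile]  # edge tiles may be smaller
            bh, bw = block.shape
            spec = np.fft.rfft2(block, s=(fh, fw)) * k_spec
            y = np.fft.irfft2(spec, s=(fh, fw))
            # Each tile's result spills kh-1 rows and kw-1 columns into its
            # neighbours; adding the overlapping borders gives the full result.
            out[i:i + bh + kh - 1, j:j + bw + kw - 1] += y[:bh + kh - 1, :bw + kw - 1]
    return out
```

In this baseline every tile is padded by the kernel size minus one in each dimension, so the overlapping borders are computed and added repeatedly; this is the kind of redundancy the abstract says the proposed splitting reduces.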
