CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks

Paper Title

CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks

Authors

Qihang Yu, Yingwei Li, Jieru Mei, Yuyin Zhou, Alan L. Yuille

Abstract

3D Convolutional Neural Networks (CNNs) have been widely applied to 3D scene understanding, such as video analysis and volumetric image recognition. However, 3D networks are easily over-parameterized, which incurs expensive computation cost. In this paper, we propose Channel-wise Automatic KErnel Shrinking (CAKES) to enable efficient 3D learning by shrinking standard 3D convolutions into a set of economic operations, e.g., 1D and 2D convolutions. Unlike previous methods, CAKES performs channel-wise kernel shrinkage, which brings the following benefits: 1) the operations deployed in every layer can be heterogeneous, so that they extract diverse and complementary information to benefit the learning process; and 2) it allows for an efficient and flexible replacement design that generalizes to both spatial-temporal and volumetric data. Further, we propose a new search space based on CAKES, so that the replacement configuration can be determined automatically for simplifying 3D networks. CAKES shows superior performance to other methods with similar model size, and it also achieves performance comparable to the state of the art with far fewer parameters and lower computational cost on tasks including 3D medical imaging segmentation and video action recognition. Code and models are available at https://github.com/yucornetto/CAKES
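
To make the parameter saving from kernel shrinking concrete, the following is a minimal arithmetic sketch, not the authors' implementation. It assumes that each output channel of a layer is assigned either a 2D kernel of shape (1, 3, 3) or a 1D kernel of shape (3, 1, 1) in place of a full 3 x 3 x 3 kernel. The function names, the 64-channel layer size, and the even split between the two kernel shapes are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from math import prod


def conv_weight_count(c_in, c_out, kernel):
    """Number of weights in a dense convolution layer (bias ignored)."""
    return c_in * c_out * prod(kernel)


def shrunk_weight_count(c_in, assignment):
    """Weights in a layer whose output channels use different kernel shapes.

    `assignment` maps a kernel shape to the number of output channels
    that use it (a hypothetical encoding chosen for this sketch).
    """
    return sum(
        conv_weight_count(c_in, n_out, kernel)
        for kernel, n_out in assignment.items()
    )


# Hypothetical layer size, chosen only for illustration.
c_in, c_out = 64, 64

# Baseline: every output channel uses a full 3x3x3 kernel.
full_3d = conv_weight_count(c_in, c_out, (3, 3, 3))

# Assumed channel-wise mix: half the output channels use a 2D kernel,
# the other half a 1D kernel.
assignment = {(1, 3, 3): 32, (3, 1, 1): 32}
mixed = shrunk_weight_count(c_in, assignment)

print(f"3D layer weights:     {full_3d}")
print(f"shrunk layer weights: {mixed}")
print(f"ratio:                {mixed / full_3d:.3f}")
```

Under these assumptions the mixed layer keeps 2/9 of the weights of the full 3D layer. Which kernel shape each channel receives is what the abstract says is determined automatically by search; the fixed split here only stands in for one possible configuration.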
