Surrogate-assisted Particle Swarm Optimisation for Evolving Variable-length Transferable Blocks for Image Classification

Paper title

Surrogate-assisted Particle Swarm Optimisation for Evolving Variable-length Transferable Blocks for Image Classification

Authors

Bin Wang, Bing Xue, Mengjie Zhang

Abstract

Deep convolutional neural networks have demonstrated promising performance on image classification tasks, but the manual design process becomes more and more complex due to the fast depth growth and the increasingly complex topologies of convolutional neural networks. As a result, neural architecture search has emerged to automatically design convolutional neural networks that outperform handcrafted counterparts. However, the computational cost is immense, e.g. 22,400 GPU-days and 2,000 GPU-days for two outstanding neural architecture search works named NAS and NASNet, respectively, which motivates this work. A new effective and efficient surrogate-assisted particle swarm optimisation algorithm is proposed to automatically evolve convolutional neural networks. This is achieved by proposing a novel surrogate model, a new method of creating a surrogate dataset and a new encoding strategy to encode variable-length blocks of convolutional neural networks, all of which are integrated into a particle swarm optimisation algorithm to form the proposed method. The proposed method shows its effectiveness by achieving competitive error rates of 3.49% on the CIFAR-10 dataset, 18.49% on the CIFAR-100 dataset, and 1.82% on the SVHN dataset. The convolutional neural network blocks are efficiently learned by the proposed method from CIFAR-10 within 3 GPU-days due to the acceleration achieved by the surrogate model and the surrogate dataset to avoid the training of 80.1% of convolutional neural network blocks represented by the particles. Without any further search, the evolved blocks from CIFAR-10 can be successfully transferred to CIFAR-100 and SVHN, which exhibits the transferability of the block learned by the proposed method.
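
The general idea of letting a surrogate decide which particles are worth an expensive evaluation can be illustrated with a minimal sketch. This is not the surrogate model, surrogate dataset, or encoding strategy from the paper, none of which are specified in the abstract. Everything here is an assumption made for illustration: a toy sphere function stands in for training a CNN block and measuring its error rate, a 1-nearest-neighbour lookup stands in for the surrogate, and the names `expensive_fitness`, `surrogate_predict`, and `surrogate_pso` are hypothetical.

```python
import random


def expensive_fitness(x):
    # Assumption: toy stand-in for training a CNN block and measuring its error.
    return sum(v * v for v in x)


def surrogate_predict(archive, x):
    # Assumption: 1-nearest-neighbour guess from positions already evaluated.
    nearest = min(
        archive,
        key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)),
    )
    return nearest[1]


def surrogate_pso(dim=3, swarm=8, iters=30, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pbest_fit = [expensive_fitness(p) for p in pos]
    # Archive of (position, true fitness) pairs that the surrogate learns from.
    archive = list(zip([p[:] for p in pos], pbest_fit))
    evaluated, skipped = swarm, 0

    for _ in range(iters):
        g = min(range(swarm), key=lambda i: pbest_fit[i])
        gbest = pbest[g][:]
        for i in range(swarm):
            # Standard PSO velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    0.7 * vel[i][d]
                    + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                    + 1.5 * r2 * (gbest[d] - pos[i][d])
                )
                pos[i][d] += vel[i][d]
            # Skip the expensive evaluation when the surrogate predicts
            # the new position is worse than the particle's personal best.
            if surrogate_predict(archive, pos[i]) > pbest_fit[i]:
                skipped += 1
                continue
            fit = expensive_fitness(pos[i])
            evaluated += 1
            archive.append((pos[i][:], fit))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit

    return min(pbest_fit), evaluated, skipped


if __name__ == "__main__":
    best, evaluated, skipped = surrogate_pso()
    print(f"best={best:.4f} evaluated={evaluated} skipped={skipped}")
```

The ratio `skipped / (evaluated + skipped)` plays the role of the share of avoided trainings; the value this toy produces has no relation to the 80.1% reported in the abstract.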
