Rethinking Recurrent Neural Networks and Other Improvements for Image Classification

Paper title

Rethinking Recurrent Neural Networks and Other Improvements for Image Classification

Authors

Phong, Nguyen Huu; Ribeiro, Bernardete

Abstract

Over the long history of machine learning, which dates back several decades, recurrent neural networks (RNNs) have been used mainly for sequential data and time series, generally with 1D information. Even in the rare studies on 2D images, these networks are used merely to learn and generate data sequentially rather than for image recognition. In this study, we propose integrating an RNN as an additional layer when designing image recognition models. We also develop end-to-end multimodel ensembles that produce expert predictions using several models. In addition, we extend the training strategy so that our model performs comparably to leading models and can even match the state of the art on several challenging datasets (e.g., SVHN (0.99), CIFAR-100 (0.9027), and CIFAR-10 (0.9852)). Moreover, our model sets a new record on the Surrey dataset (0.949). The source code of the methods provided in this article is available at https://github.com/leonlha/e2e-3m and http://nguyenhuuphong.me.
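To make the idea of an RNN used as an additional layer in an image classifier more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a small 2D feature map (standing in for the output of earlier convolutional layers), reads it row by row as a sequence, and passes the final hidden state of a simple tanh (Elman) recurrent cell to a softmax classifier. The function names, the row-wise scan order, the layer sizes, and the random weights are all assumptions made for illustration.

```python
import math
import random


def rnn_over_rows(feature_map, w_in, w_rec, bias):
    """Run a plain tanh (Elman) RNN over the rows of a 2D feature map.

    Each row of the feature map is treated as one time step (an assumption
    of this sketch). Returns the final hidden state as a list of floats.
    """
    hidden = len(bias)
    h = [0.0] * hidden
    for row in feature_map:
        new_h = []
        for j in range(hidden):
            total = bias[j]
            total += sum(w_in[j][k] * row[k] for k in range(len(row)))
            total += sum(w_rec[j][k] * h[k] for k in range(hidden))
            new_h.append(math.tanh(total))
        h = new_h
    return h


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def classify(feature_map, params):
    """Recurrent layer followed by a linear layer and softmax."""
    h = rnn_over_rows(feature_map, params["w_in"], params["w_rec"], params["bias"])
    logits = [sum(w * x for w, x in zip(row, h)) for row in params["w_out"]]
    return softmax(logits)


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for the demonstration."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Illustrative sizes only; none of these values come from the paper.
rng = random.Random(0)
H, W, HIDDEN, CLASSES = 4, 6, 3, 10
params = {
    "w_in": random_matrix(HIDDEN, W, rng),
    "w_rec": random_matrix(HIDDEN, HIDDEN, rng),
    "bias": [0.0] * HIDDEN,
    "w_out": random_matrix(CLASSES, HIDDEN, rng),
}
feature_map = random_matrix(H, W, rng)  # stand-in for convolutional features
hidden_state = rnn_over_rows(feature_map, params["w_in"], params["w_rec"], params["bias"])
probs = classify(feature_map, params)
```

Here `probs` holds one probability per class. In a trained model the weights would be learned jointly with the convolutional layers rather than drawn at random, and a gated cell such as an LSTM or GRU could replace the plain tanh cell.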
