Paper Title

TResNet: High Performance GPU-Dedicated Architecture

Paper Authors

Tal Ridnik, Hussam Lawen, Asaf Noy, Emanuel Ben Baruch, Gilad Sharir, Itamar Friedman

Abstract

Many deep learning models developed in recent years reach higher ImageNet accuracy than ResNet50 with fewer or comparable FLOPs. While FLOPs are often seen as a proxy for network efficiency, when measuring actual GPU training and inference throughput, vanilla ResNet50 is usually significantly faster than its recent competitors, offering a better throughput-accuracy trade-off. In this work, we introduce a series of architecture modifications that aim to boost neural networks' accuracy while retaining their GPU training and inference efficiency. We first demonstrate and discuss the bottlenecks induced by FLOPs optimizations. We then suggest alternative designs that better utilize GPU structure and assets. Finally, we introduce a new family of GPU-dedicated models, called TResNet, which achieve better accuracy and efficiency than previous ConvNets. Using a TResNet model with similar GPU throughput to ResNet50, we reach 80.8% top-1 accuracy on ImageNet. Our TResNet models also transfer well and achieve state-of-the-art accuracy on competitive single-label classification datasets such as Stanford Cars (96.0%), CIFAR-10 (99.0%), CIFAR-100 (91.5%) and Oxford-Flowers (99.1%). They also perform well on multi-label classification and object detection tasks. Implementation is available at: https://github.com/mrT23/TResNet.
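The abstract contrasts FLOPs with measured throughput. As a rough illustration of what "measuring actual throughput" can mean, the sketch below times a per-batch callable and reports samples per second. It is not the authors' benchmarking code: `measure_throughput`, `dummy_step`, and the `sync` hook are hypothetical names introduced here, and the dummy workload stands in for a real model's forward pass. On a GPU, kernels launch asynchronously, so a synchronization call (for example `torch.cuda.synchronize` in PyTorch) would need to be passed as `sync` for the timing to be meaningful.

```python
import time


def measure_throughput(step, batch_size, warmup=5, iters=20, sync=None):
    """Return samples per second for `step`, a zero-argument callable
    that processes one batch of `batch_size` samples.

    `sync` is an optional zero-argument callable that blocks until all
    pending device work has finished (an assumption of this sketch).
    """
    if batch_size <= 0 or iters <= 0:
        raise ValueError("batch_size and iters must be positive")

    # Warm-up runs are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(iters):
        step()
    if sync is not None:
        sync()  # make sure queued work is done before stopping the clock
    elapsed = time.perf_counter() - start

    return batch_size * iters / elapsed


def dummy_step():
    """Hypothetical stand-in for one forward pass over a batch."""
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    rate = measure_throughput(dummy_step, batch_size=64)
    print(f"{rate:.1f} samples/sec")
```

Two models with similar FLOPs can report very different numbers under a measurement like this, which is the gap the abstract points to.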
