Paper Title

One Network Doesn't Rule Them All: Moving Beyond Handcrafted Architectures in Self-Supervised Learning

Authors

Girish, Sharath; Dey, Debadeepta; Joshi, Neel; Vineet, Vibhav; Shah, Shital; Mendes, Caio Cesar Teodoro; Shrivastava, Abhinav; Song, Yale

Abstract

The current literature on self-supervised learning (SSL) focuses on developing learning objectives to train neural networks more effectively on unlabeled data. The typical development process involves taking well-established architectures, e.g., ResNet demonstrated on ImageNet, and using them to evaluate newly developed objectives on downstream scenarios. While convenient, this does not take into account the role of architectures which has been shown to be crucial in the supervised learning literature. In this work, we establish extensive empirical evidence showing that a network architecture plays a significant role in SSL. We conduct a large-scale study with over 100 variants of ResNet and MobileNet architectures and evaluate them across 11 downstream scenarios in the SSL setting. We show that there is no one network that performs consistently well across the scenarios. Based on this, we propose to learn not only network weights but also architecture topologies in the SSL regime. We show that "self-supervised architectures" outperform popular handcrafted architectures (ResNet18 and MobileNetV2) while performing competitively with the larger and computationally heavy ResNet50 on major image classification benchmarks (ImageNet-1K, iNat2021, and more). Our results suggest that it is time to consider moving beyond handcrafted architectures in SSL and start thinking about incorporating architecture search into self-supervised learning objectives.
