Paper Title

Adversarial Masking for Self-Supervised Learning

Paper Authors

Shi, Yuge, Siddharth, N., Torr, Philip H. S., Kosiorek, Adam R.

Abstract

We propose ADIOS, a masked image model (MIM) framework for self-supervised learning, which simultaneously learns a masking function and an image encoder using an adversarial objective. The image encoder is trained to minimise the distance between representations of the original and that of a masked image. The masking function, conversely, aims at maximising this distance. ADIOS consistently improves on state-of-the-art self-supervised learning (SSL) methods on a variety of tasks and datasets -- including classification on ImageNet100 and STL10, transfer learning on CIFAR10/100, Flowers102 and iNaturalist, as well as robustness evaluated on the backgrounds challenge (Xiao et al., 2021) -- while generating semantically meaningful masks. Unlike modern MIM models such as MAE, BEiT and iBOT, ADIOS does not rely on the image-patch tokenisation construction of Vision Transformers, and can be implemented with convolutional backbones. We further demonstrate that the masks learned by ADIOS are more effective in improving representation learning of SSL methods than masking schemes used in popular MIM models. Code is available at https://github.com/YugeTen/adios.
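To make the adversarial objective concrete, below is a minimal training-step sketch, assuming a PyTorch setup with illustrative names (`encoder`, `mask_net`, `enc_opt`, `mask_opt`) and cosine distance as the representation distance. These choices are assumptions for illustration only, not the authors' released implementation (see the repository linked above for the actual code).

```python
# Minimal sketch of an ADIOS-style adversarial masking step (illustrative, not the authors' code).
# Assumes: `encoder` maps images to representation vectors, `mask_net` maps images to
# per-pixel masks in [0, 1], and cosine distance is used as the representation distance.
import torch
import torch.nn.functional as F

def adversarial_masking_step(encoder, mask_net, enc_opt, mask_opt, images):
    # --- Encoder step: minimise distance between original and masked representations ---
    masks = mask_net(images).detach()            # hold the masking function fixed
    z_orig = encoder(images)
    z_masked = encoder(images * (1.0 - masks))   # occlude the masked regions
    dist = 1.0 - F.cosine_similarity(z_orig, z_masked, dim=-1).mean()
    enc_opt.zero_grad()
    dist.backward()
    enc_opt.step()

    # --- Masking step: maximise the same distance (the adversarial objective) ---
    masks = mask_net(images)
    z_orig = encoder(images).detach()            # hold the encoder target fixed
    z_masked = encoder(images * (1.0 - masks))
    adv_loss = -(1.0 - F.cosine_similarity(z_orig, z_masked, dim=-1).mean())
    mask_opt.zero_grad()
    adv_loss.backward()
    mask_opt.step()
```

Each call alternates the two players of the min-max game: the encoder is updated to make masked and unmasked views agree, while the masking network is updated to find occlusions that break that agreement.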
