Paper Title
Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning
Paper Authors
Paper Abstract
Recently advanced unsupervised learning approaches use a Siamese-like framework that compares two "views" of the same image to learn representations. Making the two views distinctive is core to guaranteeing that unsupervised methods learn meaningful information. However, such frameworks can be fragile and prone to overfitting if the augmentations used to generate the two views are not strong enough, causing overconfidence on the training data. This drawback hinders the model from learning subtle variance and fine-grained information. To address this, in this work we aim to introduce the concept of distance in label space into unsupervised learning and make the model aware of the soft degree of similarity between positive or negative pairs by mixing the input data space, so that the input and loss spaces work collaboratively. Despite its conceptual simplicity, we show empirically that with our solution -- unsupervised image mixtures (Un-Mix) -- we can learn subtler, more robust, and more generalized representations from the transformed input and the corresponding new label space. Extensive experiments are conducted on CIFAR-10, CIFAR-100, STL-10, Tiny ImageNet, and standard ImageNet with the popular unsupervised methods SimCLR, BYOL, MoCo V1&V2, SwAV, etc. Our proposed image mixture and label assignment strategy obtains a consistent improvement of 1~3% over the base methods while following exactly the same hyperparameters and training procedures. Code is publicly available at https://github.com/szq0214/Un-Mix.
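The core idea described above, mixing two images in the input space and weighting the contrastive losses by the mixture coefficient in the loss space, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `unmix_batch` and `soft_pair_loss` are hypothetical, and the actual Un-Mix method (see the linked repository) also supports region-level (CutMix-style) mixing and plugs into each base method's own contrastive loss.

```python
import numpy as np

def unmix_batch(batch: np.ndarray, lam: float) -> np.ndarray:
    # Mix each image with its counterpart from the reversed batch,
    # a common mixup-style pairing: lam * x_i + (1 - lam) * x_{B-1-i}.
    return lam * batch + (1.0 - lam) * batch[::-1]

def soft_pair_loss(loss_to_orig: float, loss_to_partner: float, lam: float) -> float:
    # Soft degree of similarity in the loss space: the mixed image is
    # lam-similar to the original and (1 - lam)-similar to its mixing
    # partner, so the two contrastive losses are weighted accordingly.
    return lam * loss_to_orig + (1.0 - lam) * loss_to_partner

rng = np.random.default_rng(0)
batch = rng.random((4, 3, 32, 32)).astype(np.float32)  # toy batch of 4 images
lam = float(rng.beta(1.0, 1.0))                        # mixture coefficient
mixed = unmix_batch(batch, lam)                        # same shape as batch
```

In this sketch, the mixed batch would replace one of the two augmented views fed to the Siamese framework, and the two branch losses would be combined with `soft_pair_loss` instead of a hard positive/negative assignment.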