Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Paper Title

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Authors

Yuchao Wang, Haochen Wang, Yujun Shen, Jingjing Fei, Wei Li, Guoqiang Jin, Liwei Wu, Rui Zhao, Xinyi Le

Abstract

The crux of semi-supervised semantic segmentation is to assign adequate pseudo-labels to the pixels of unlabeled images. A common practice is to select the highly confident predictions as the pseudo ground-truth, but this leaves most pixels unused because their predictions are unreliable. We argue that every pixel matters to model training, even if its prediction is ambiguous. Intuitively, an unreliable prediction may get confused among the top classes (i.e., those with the highest probabilities); however, it should be confident that the pixel does not belong to the remaining classes. Hence, such a pixel can be convincingly treated as a negative sample for those most unlikely categories. Based on this insight, we develop an effective pipeline to make sufficient use of unlabeled data. Concretely, we separate reliable and unreliable pixels via the entropy of predictions, push each unreliable pixel to a category-wise queue that consists of negative samples, and manage to train the model with all candidate pixels. Considering the training evolution, where the prediction becomes more and more accurate, we adaptively adjust the threshold for the reliable-unreliable partition. Experimental results on various benchmarks and training settings demonstrate the superiority of our approach over the state-of-the-art alternatives.
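
The partition step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the quantile-based entropy threshold, the linear decay schedule, and the "least probable k classes" rule are all assumptions made for illustration, and the category-wise queue and training loss are left out entirely.

```python
import numpy as np


def pixel_entropy(probs):
    """Per-pixel entropy of class probabilities; probs has shape (C, H, W)."""
    eps = 1e-12  # avoids log(0)
    return -np.sum(probs * np.log(probs + eps), axis=0)


def unreliable_fraction(epoch, total_epochs, initial_fraction=0.2):
    """Hypothetical schedule: shrink the unreliable share linearly to zero."""
    return initial_fraction * (1.0 - epoch / total_epochs)


def reliable_mask(probs, fraction):
    """Boolean (H, W) mask, True where entropy is below the adaptive cut.

    Assumption: the cut is the (1 - fraction) quantile of the entropy map,
    so roughly `fraction` of the pixels are marked unreliable.
    """
    entropy = pixel_entropy(probs)
    threshold = np.quantile(entropy, 1.0 - fraction)
    return entropy <= threshold


def negative_class_mask(probs, num_unlikely):
    """Boolean (C, H, W) mask marking the least probable classes per pixel.

    Assumption: a pixel is a negative candidate for its `num_unlikely`
    lowest-probability classes.
    """
    order = np.argsort(probs, axis=0)  # ascending probability per pixel
    mask = np.zeros(probs.shape, dtype=bool)
    np.put_along_axis(mask, order[:num_unlikely], True, axis=0)
    return mask
```

In a training loop, `unreliable_fraction` would be re-evaluated every epoch so that fewer pixels are treated as unreliable as predictions improve. Pixels outside `reliable_mask` would then contribute only as negatives for the classes flagged by `negative_class_mask`.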

Paper Title