Exploiting Shape Cues for Weakly Supervised Semantic Segmentation

Paper Title

Exploiting Shape Cues for Weakly Supervised Semantic Segmentation

Authors

Sungpil Kho, Pilhyeon Lee, Wonyoung Lee, Minsong Ki, Hyeran Byun

Abstract

Weakly supervised semantic segmentation (WSSS) aims to produce pixel-wise class predictions with only image-level labels for training. To this end, previous methods adopt the common pipeline: they generate pseudo masks from class activation maps (CAMs) and use such masks to supervise segmentation networks. However, it is challenging to derive comprehensive pseudo masks that cover the whole extent of objects due to the local property of CAMs, i.e., they tend to focus solely on small discriminative object parts. In this paper, we associate the locality of CAMs with the texture-biased property of convolutional neural networks (CNNs). Accordingly, we propose to exploit shape information to supplement the texture-biased CNN features, thereby encouraging mask predictions to be not only comprehensive but also well-aligned with object boundaries. We further refine the predictions in an online fashion with a novel refinement method that takes into account both the class and the color affinities, in order to generate reliable pseudo masks to supervise the model. Importantly, our model is end-to-end trained within a single-stage framework and therefore efficient in terms of the training cost. Through extensive experiments on PASCAL VOC 2012, we validate the effectiveness of our method in producing precise and shape-aligned segmentation results. Specifically, our model surpasses the existing state-of-the-art single-stage approaches by large margins. What is more, it also achieves a new state-of-the-art performance over multi-stage approaches, when adopted in a simple two-stage pipeline without bells and whistles.
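
The common pipeline mentioned in the abstract (pseudo masks derived from CAMs) can be illustrated with a minimal sketch. This is not the authors' refinement method; it only shows the generic baseline step of turning per-class activation maps into a pseudo mask. The function name `pseudo_mask_from_cams`, the `bg_threshold` parameter and its default of 0.3, and the convention that label 0 means background are all assumptions made for illustration.

```python
def pseudo_mask_from_cams(cams, image_labels, bg_threshold=0.3):
    """Build a pseudo segmentation mask from class activation maps.

    cams: dict mapping a positive class id to an H x W nested list of
          non-negative activation scores (hypothetical input format).
    image_labels: set of class ids known to be present in the image.
    Returns an H x W nested list: 0 for background, else a class id.
    """
    # Only classes named by the image-level labels may appear in the mask.
    present = [c for c in sorted(image_labels) if c in cams]
    if not present:
        raise ValueError("no CAM available for the given image-level labels")

    height = len(cams[present[0]])
    width = len(cams[present[0]][0])

    # Scale each CAM by its own maximum so scores are comparable across classes.
    normalized = {}
    for c in present:
        peak = max(max(row) for row in cams[c])
        scale = peak if peak > 0 else 1.0
        normalized[c] = [[v / scale for v in row] for row in cams[c]]

    # Assign each pixel to its strongest class, or to background (0)
    # when even the strongest activation falls below the assumed threshold.
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best = max(present, key=lambda c: normalized[c][y][x])
            if normalized[best][y][x] >= bg_threshold:
                mask[y][x] = best
    return mask
```

Because CAMs tend to respond only to small discriminative parts, a thresholded mask like this typically leaves much of the object labeled as background, which is the limitation the abstract sets out to address.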
