Weakly-Supervised Salient Object Detection via Scribble Annotations

Paper Title

Weakly-Supervised Salient Object Detection via Scribble Annotations

Authors

Jing Zhang, Xin Yu, Aixuan Li, Peipei Song, Bowen Liu, Yuchao Dai

Abstract

Compared with laborious pixel-wise dense labeling, labeling data with scribbles is much easier: it takes only 1$\sim$2 seconds per image. However, learning salient object detection from scribble labels has not been explored. In this paper, we propose a weakly-supervised salient object detection model that learns saliency from such annotations. To this end, we first relabel an existing large-scale salient object detection dataset with scribbles, producing the S-DUTS dataset. Because scribbles do not capture object structure or fine detail, training directly on scribble labels yields saliency maps with poor boundary localization. To mitigate this problem, we propose an auxiliary edge detection task that localizes object edges explicitly, and a gated structure-aware loss that constrains the scope of the structure to be recovered. Moreover, we design a scribble boosting scheme that iteratively consolidates our scribble annotations, which are then used as supervision to learn high-quality saliency maps. Because existing saliency evaluation metrics do not measure the structure alignment of predictions, the rankings they assign to saliency maps may not agree with human perception. We present a new metric, termed the saliency structure measure, that measures the structure alignment of predicted saliency maps and is more consistent with human perception. Extensive experiments on six benchmark datasets demonstrate that our method not only outperforms existing weakly-supervised/unsupervised methods but is also on par with several fully-supervised state-of-the-art models. Our code and data are publicly available at https://github.com/JingZhang617/Scribble_Saliency.
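To make the two training signals mentioned in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a scribble encoding of 1 = foreground, 2 = background, 0 = unlabeled, and it approximates the gated structure-aware loss with a generic edge-aware smoothness term. The function names `partial_bce` and `gated_smoothness`, the `gate` mask, and the value `alpha=10.0` are all illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def partial_bce(pred, scribble, eps=1e-7):
    """Binary cross-entropy averaged over scribbled pixels only.

    pred:     (H, W) saliency probabilities in [0, 1].
    scribble: (H, W) ints; assumed encoding 1 = foreground,
              2 = background, 0 = unlabeled (ignored).
    """
    labeled = scribble > 0
    if not labeled.any():
        return 0.0
    p = np.clip(pred[labeled], eps, 1.0 - eps)
    y = (scribble[labeled] == 1).astype(np.float64)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

def gated_smoothness(pred, image, gate, alpha=10.0):
    """Edge-aware smoothness sketch (hypothetical stand-in for the gated loss).

    Saliency gradients are penalised everywhere except where the gated
    grayscale image (image * gate) itself has a strong gradient, so
    saliency edges are encouraged to line up with image edges inside the gate.
    """
    gated = image * gate
    loss = 0.0
    for axis in (0, 1):
        ds = np.abs(np.diff(pred, axis=axis))    # saliency gradient
        di = np.abs(np.diff(gated, axis=axis))   # gated image gradient
        loss += np.mean(np.sqrt(ds ** 2 + 1e-6) * np.exp(-alpha * di))
    return float(loss)
```

In a training loop these two terms would typically be summed with a weighting factor; the weight, like the rest of this sketch, would need to be taken from the released code rather than from the abstract.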
