Few-shot semantic segmentation via mask aggregation

Paper title

Few-shot semantic segmentation via mask aggregation

Authors

Ao, Wei; Zheng, Shunyi; Meng, Yan

Abstract

Few-shot semantic segmentation aims to recognize novel classes with only very few labelled data. This challenging task requires mining of the relevant relationships between the query image and the support images. Previous works have typically regarded it as a pixel-wise classification problem. Therefore, various models have been designed to explore the correlation of pixels between the query image and the support images. However, they focus only on pixel-wise correspondence and ignore the overall correlation of objects. In this paper, we introduce a mask-based classification method for addressing this problem. The mask aggregation network (MANet), which is a simple mask classification model, is proposed to simultaneously generate a fixed number of masks and their probabilities of being targets. Then, the final segmentation result is obtained by aggregating all the masks according to their locations. Experiments on both the PASCAL-5^i and COCO-20^i datasets show that our method performs comparably to the state-of-the-art pixel-based methods. This competitive performance demonstrates the potential of mask classification as an alternative baseline method in few-shot semantic segmentation. Our source code will be made available at https://github.com/TinyAway/MANet.
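
The aggregation step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes one plausible reading, in which each of the N predicted masks is weighted by its probability of being the target and the weighted masks are summed per pixel, then thresholded. The function name `aggregate_masks`, the argument names, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def aggregate_masks(mask_logits, target_probs, threshold=0.5):
    """Combine N candidate masks into one binary segmentation map.

    Assumed shapes (illustrative, not taken from the paper):
      mask_logits:  (N, H, W) raw per-pixel scores for each candidate mask
      target_probs: (N,) probability that each mask belongs to the target class
    """
    # Per-pixel sigmoid turns raw scores into soft masks in [0, 1].
    mask_probs = 1.0 / (1.0 + np.exp(-mask_logits))
    # Weight every soft mask by its target probability and sum over masks.
    fg_score = np.einsum("n,nhw->hw", target_probs, mask_probs)
    # Overlapping masks can push the sum above 1, so clip before thresholding.
    fg_score = np.clip(fg_score, 0.0, 1.0)
    return (fg_score > threshold).astype(np.uint8)

# Toy example: two 4x4 candidate masks covering opposite corners.
example_logits = np.full((2, 4, 4), -10.0)
example_logits[0, :2, :2] = 10.0   # mask 0 covers the top-left 2x2 block
example_logits[1, 2:, 2:] = 10.0   # mask 1 covers the bottom-right 2x2 block
example_probs = np.array([0.9, 0.1])  # mask 0 is likely the target, mask 1 is not
example_result = aggregate_masks(example_logits, example_probs)
```

In this toy case only the region covered by the high-probability mask survives the threshold. Weighting whole masks, rather than classifying each pixel independently, is what separates the mask-classification view from the pixel-wise view that the abstract contrasts it with.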
