Geometry Constrained Weakly Supervised Object Localization

Paper Title

Geometry Constrained Weakly Supervised Object Localization

Authors

Weizeng Lu, Xi Jia, Weicheng Xie, Linlin Shen, Yicong Zhou, Jinming Duan

Abstract

We propose a geometry constrained network, termed GC-Net, for weakly supervised object localization (WSOL). GC-Net consists of three modules: a detector, a generator, and a classifier. The detector predicts the object location as a set of coefficients describing a geometric shape (i.e., an ellipse or a rectangle), which is geometrically constrained by the mask produced by the generator. The classifier takes the resulting masked images as input and performs two complementary classification tasks, one for the object and one for the background. To make the mask more compact and more complete, we propose a novel multi-task loss function that takes into account the area of the geometric shape, the categorical cross-entropy, and the negative entropy. In contrast to previous approaches, GC-Net is trained end-to-end and predicts the object location without any post-processing (e.g., thresholding) that may require additional tuning. Extensive experiments on the CUB-200-2011 and ILSVRC2012 datasets show that GC-Net outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/lwzeng/GC-Net.
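The abstract names three ingredients of the multi-task loss (shape area, categorical cross-entropy, negative entropy) without giving its exact form. The sketch below is one possible reading, not the authors' implementation. It assumes the cross-entropy applies to the object-masked image, the negative entropy applies to the background-masked image, the shape is an axis-aligned ellipse with semi-axes a and b in normalized units, and the terms are combined as a weighted sum. All function names and the weights w_area and w_bg are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ellipse_area(a, b):
    """Area of an ellipse with semi-axes a and b."""
    return math.pi * a * b

def cross_entropy(probs, label):
    """Categorical cross-entropy for one example with an integer label."""
    return -math.log(probs[label] + 1e-12)

def negative_entropy(probs):
    """Negative Shannon entropy; it is lowest for a uniform distribution."""
    return sum(p * math.log(p + 1e-12) for p in probs)

def multi_task_loss(obj_logits, bg_logits, label, a, b, w_area=1.0, w_bg=1.0):
    """Assumed weighted sum of the three terms named in the abstract."""
    obj_probs = softmax(obj_logits)  # classifier output on the object-masked image
    bg_probs = softmax(bg_logits)    # classifier output on the background-masked image
    return (cross_entropy(obj_probs, label)        # object should be recognizable
            + w_bg * negative_entropy(bg_probs)    # background should be uninformative
            + w_area * ellipse_area(a, b))         # smaller shapes are preferred

# Example: same predictions, two candidate ellipse sizes.
loose = multi_task_loss([2.0, 0.1, 0.1], [0.3, 0.3, 0.3], 0, a=0.5, b=0.5)
tight = multi_task_loss([2.0, 0.1, 0.1], [0.3, 0.3, 0.3], 0, a=0.3, b=0.3)
```

Under these assumptions, the area term rewards compact masks, while the two classification terms resist shrinking the mask so far that the object becomes unrecognizable or class evidence leaks into the background.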
