GOOD: Exploring Geometric Cues for Detecting Objects in an Open World

Paper Title

GOOD: Exploring Geometric Cues for Detecting Objects in an Open World

Authors

Haiwen Huang, Andreas Geiger, Dan Zhang

Abstract

We address the task of open-world class-agnostic object detection, i.e., detecting every object in an image by learning from a limited number of base object classes. State-of-the-art RGB-based models suffer from overfitting the training classes and often fail at detecting novel-looking objects. This is because RGB-based models primarily rely on appearance similarity to detect novel objects and are also prone to overfitting short-cut cues such as textures and discriminative parts. To address these shortcomings of RGB-based object detectors, we propose incorporating geometric cues such as depth and normals, predicted by general-purpose monocular estimators. Specifically, we use the geometric cues to train an object proposal network for pseudo-labeling unannotated novel objects in the training set. Our resulting Geometry-guided Open-world Object Detector (GOOD) significantly improves detection recall for novel object categories and already performs well with only a few training classes. Using a single "person" class for training on the COCO dataset, GOOD surpasses SOTA methods by 5.0% AR@100, a relative improvement of 24%.
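The pseudo-labeling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a proposal network trained on geometric cues has already produced scored boxes, and that the highest-scoring proposals that barely overlap the annotated base-class boxes are kept as pseudo labels. The function names, `top_k`, and `max_iou` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_labels(
    proposals: List[Tuple[Box, float]],  # (box, objectness score) pairs
    annotated: List[Box],                # ground-truth boxes of base classes
    top_k: int = 3,                      # hypothetical budget per image
    max_iou: float = 0.3,                # hypothetical overlap threshold
) -> List[Box]:
    """Keep the top-k highest-scoring proposals that do not overlap
    any annotated box by more than max_iou."""
    ranked = sorted(proposals, key=lambda p: p[1], reverse=True)
    kept: List[Box] = []
    for box, _score in ranked:
        if len(kept) >= top_k:
            break
        if all(iou(box, gt) <= max_iou for gt in annotated):
            kept.append(box)
    return kept


if __name__ == "__main__":
    proposals = [
        ((0, 0, 10, 10), 0.9),    # coincides with the annotated box
        ((20, 20, 30, 30), 0.8),
        ((40, 40, 50, 50), 0.7),
        ((60, 60, 70, 70), 0.6),
    ]
    annotated = [(0, 0, 10, 10)]
    print(select_pseudo_labels(proposals, annotated, top_k=2))
```

In this sketch the selected boxes would be added to the annotated boxes as extra training targets. The sketch does not remove near-duplicate proposals from the kept set, which a practical pipeline would likely handle with non-maximum suppression.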
