Paper Title

Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization

Paper Authors

Hanjiang Hu, Hesheng Wang, Zhe Liu, Weidong Chen

Paper Abstract

Visual localization is a crucial component in mobile robotics and autonomous driving applications. Image retrieval is an efficient and effective technique for image-based localization. Drastic variability in environmental conditions, e.g., illumination, seasonal, and weather changes, severely affects retrieval-based visual localization and makes it a challenging problem. In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation. A novel gradient-weighted similarity activation mapping (Grad-SAM) loss is then incorporated for finer localization with higher accuracy. We also propose a new adaptive triplet loss to boost contrastive learning of the embeddings in a self-supervised manner. The final coarse-to-fine image retrieval pipeline is implemented as a sequential combination of the models trained without and with the Grad-SAM loss. Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset. The strong generalization ability of our approach is verified on the RobotCar dataset using models pre-trained on the urban part of the CMU-Seasons dataset. Our method performs on par with or even better than state-of-the-art image-based localization baselines at medium and high precision, especially in challenging environments with illumination variance, vegetation, and night-time images. The code and pretrained models are available at https://github.com/HanjiangHu/DISAM.
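The abstract names its two key losses without giving their formulations, so the sketches below are illustrative reconstructions, not the paper's exact method. First, the Grad-SAM name suggests a Grad-CAM-style activation map in which channel weights come from the gradient of an image-pair similarity score rather than a class logit. Here is a minimal PyTorch sketch under that assumption; the function name, average pooling, and max normalization are illustrative choices:

```python
import torch
import torch.nn.functional as F

def similarity_activation_map(feat_q, feat_r):
    """Grad-CAM-style map for a query/retrieved image pair.

    feat_q, feat_r: conv feature maps of shape (1, C, H, W); feat_q must
    require gradients (e.g. taken from the encoder's last conv layer).
    Returns an (H, W) map of the regions driving the pair's similarity.
    """
    # Pool to global embeddings and score their cosine similarity.
    emb_q = feat_q.mean(dim=(2, 3))
    emb_r = feat_r.mean(dim=(2, 3))
    sim = F.cosine_similarity(emb_q, emb_r).sum()

    # Gradient of the similarity w.r.t. the query feature map yields
    # per-channel importance weights -- the "gradient-weighted" part.
    (grad_q,) = torch.autograd.grad(sim, feat_q, retain_graph=True)
    weights = grad_q.mean(dim=(2, 3), keepdim=True)          # (1, C, 1, 1)

    # Channel-weighted sum over the feature map, rectified as in Grad-CAM.
    sam = F.relu((weights * feat_q).sum(dim=1)).squeeze(0)   # (H, W)
    return sam / (sam.max() + 1e-8)

# Smoke test with random features standing in for a real encoder.
feat_q = torch.randn(1, 256, 14, 14, requires_grad=True)
feat_r = torch.randn(1, 256, 14, 14)
print(similarity_activation_map(feat_q, feat_r).shape)  # torch.Size([14, 14])
```

Second, a sketch of a triplet loss with a data-dependent margin, in the spirit of the adaptive triplet loss the abstract mentions; the adaptation rule here (shrinking the margin when the positive and negative are already close) is a hypothetical stand-in for the paper's rule:

```python
import torch
import torch.nn.functional as F

def adaptive_triplet_loss(anchor, positive, negative, base_margin=0.3):
    """Triplet loss whose margin depends on the triplet itself.

    anchor/positive/negative: (B, D) L2-normalized embeddings.
    """
    d_ap = (anchor - positive).pow(2).sum(dim=1)   # anchor-positive distance
    d_an = (anchor - negative).pow(2).sum(dim=1)   # anchor-negative distance
    # Hypothetical adaptation: hard triplets (positive and negative nearly
    # coincide) get a smaller margin so the hinge does not dominate training.
    margin = base_margin * (positive - negative).pow(2).sum(dim=1).clamp(max=1.0)
    return F.relu(d_ap - d_an + margin).mean()
```

Consistent with the abstract's coarse-to-fine pipeline, the model trained without the Grad-SAM loss would supply the coarse candidate list, which the Grad-SAM-trained model then re-ranks for the final fine retrieval.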
