Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation

Paper Title

Unsupervised Monocular Depth Estimation for Night-time Images using Adversarial Domain Feature Adaptation

Authors

Vankadari, Madhu; Garg, Sourav; Majumder, Anima; Kumar, Swagat; Behera, Ardhendu

Abstract

In this paper, we look into the problem of estimating per-pixel depth maps from unconstrained RGB monocular night-time images, a difficult task that has not been addressed adequately in the literature. State-of-the-art day-time depth estimation methods fail when tested on night-time images because of the large domain shift between the two. The usual photometric losses used for training these networks may not work for night-time images, because the uniform lighting commonly present in day-time images is absent. We propose to solve this problem by posing it as a domain adaptation problem in which a network trained on day-time images is adapted to work on night-time images. Specifically, an encoder is trained with a PatchGAN-based adversarial discriminative learning method to generate features from night-time images that are indistinguishable from those obtained from day-time images. Unlike existing methods that directly adapt the depth prediction (the network output), we propose to adapt the feature maps obtained from the encoder network, so that a pre-trained day-time depth decoder can be used directly to predict depth from these adapted features. The resulting method is termed "Adversarial Domain Feature Adaptation (ADFA)", and its efficacy is demonstrated through experiments on the challenging Oxford night driving dataset. The modular encoder-decoder architecture of the proposed ADFA method also allows the encoder module to be used as a feature extractor in many other applications. One such application is demonstrated: features obtained from our adapted encoder network are shown to outperform other state-of-the-art methods on a visual place recognition problem, thereby further establishing the usefulness and effectiveness of the proposed approach.
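
The adversarial objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a PatchGAN-style discriminator that returns a grid of per-patch logits over a feature map, and it assumes a standard binary cross-entropy GAN loss, which the abstract does not specify. The function names (patch_loss, discriminator_loss, night_encoder_loss) are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy for one logit and a 0/1 target.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def patch_loss(patch_logits, target):
    # A PatchGAN-style discriminator scores each patch separately;
    # the loss is the mean over the whole grid of patch logits.
    values = [bce_with_logits(z, target) for row in patch_logits for z in row]
    return sum(values) / len(values)


def discriminator_loss(day_logits, night_logits):
    # Assumed labelling: features from the frozen day-time encoder are "real" (1),
    # features from the night-time encoder being adapted are "fake" (0).
    return 0.5 * (patch_loss(day_logits, 1.0) + patch_loss(night_logits, 0.0))


def night_encoder_loss(night_logits):
    # The night-time encoder is updated so that its features are scored as
    # day-time features, i.e. it tries to push every patch logit towards "real".
    return patch_loss(night_logits, 1.0)


if __name__ == "__main__":
    # Hypothetical 2x2 grids of patch logits from the discriminator.
    day = [[2.0, 1.5], [1.0, 2.5]]
    night = [[-1.0, -2.0], [0.5, -0.5]]
    print("discriminator loss:", discriminator_loss(day, night))
    print("night encoder loss:", night_encoder_loss(night))
```

Under these assumptions only the night-time encoder and the discriminator would be trained; the day-time encoder and the depth decoder stay fixed, which is what would let the pre-trained decoder be reused on the adapted features.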
