Paper Title

Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation

Paper Authors

Jinwoo Bae, Sungho Moon, Sunghoon Im

Paper Abstract

Self-supervised monocular depth estimation has been widely studied recently. Most work has focused on improving performance on benchmark datasets such as KITTI, but has offered only a few experiments on generalization performance. In this paper, we investigate how backbone networks (e.g., CNNs, Transformers, and CNN-Transformer hybrid models) affect the generalization of monocular depth estimation. We first evaluate state-of-the-art models on diverse public datasets that were never seen during network training. Next, we investigate the effects of texture-biased and shape-biased representations using various texture-shifted datasets that we generated. We observe that Transformers exhibit a strong shape bias, whereas CNNs exhibit a strong texture bias. We also find that shape-biased models show better generalization performance for monocular depth estimation than texture-biased models. Based on these observations, we design a new CNN-Transformer hybrid network with a multi-level adaptive feature fusion module, called MonoFormer. The design intuition behind MonoFormer is to increase shape bias by employing Transformers while compensating for the weak locality bias of Transformers by adaptively fusing multi-level representations. Extensive experiments show that the proposed method achieves state-of-the-art performance on various public datasets. Our method also shows the best generalization ability among competing methods.
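
The abstract describes the multi-level adaptive feature fusion module only at a high level. Below is a minimal PyTorch sketch of the general idea, not the authors' implementation: features from several backbone stages are projected to a shared channel width and combined with input-dependent per-level weights. The class name, dimensions, and the softmax gating scheme are all illustrative assumptions.

```python
# A minimal sketch of multi-level adaptive feature fusion, in the spirit of
# the abstract's description of MonoFormer. NOT the authors' released code;
# names, dimensions, and the gating scheme are illustrative assumptions.
import torch
import torch.nn as nn


class AdaptiveFeatureFusion(nn.Module):
    def __init__(self, in_dims, fused_dim=256):
        super().__init__()
        # Project each level's features to a common channel width.
        self.projs = nn.ModuleList(
            nn.Conv2d(d, fused_dim, kernel_size=1) for d in in_dims
        )
        # Predict one weight per level from globally pooled features,
        # so the fusion adapts to the input image.
        self.gate = nn.Sequential(
            nn.Linear(fused_dim * len(in_dims), len(in_dims)),
            nn.Softmax(dim=-1),
        )

    def forward(self, feats):
        # feats: list of (B, C_i, H, W) maps from different backbone
        # levels, assumed already resized to a shared resolution.
        projected = [proj(f) for proj, f in zip(self.projs, feats)]
        pooled = torch.cat([p.mean(dim=(2, 3)) for p in projected], dim=-1)
        weights = self.gate(pooled)                # (B, L), rows sum to 1
        stacked = torch.stack(projected, dim=1)    # (B, L, C, H, W)
        w = weights.view(*weights.shape, 1, 1, 1)  # (B, L, 1, 1, 1)
        return (stacked * w).sum(dim=1)            # (B, C, H, W)


# Example: fuse three hypothetical Transformer stages.
fusion = AdaptiveFeatureFusion(in_dims=[96, 192, 384])
feats = [torch.randn(2, c, 32, 32) for c in (96, 192, 384)]
out = fusion(feats)
print(out.shape)  # torch.Size([2, 256, 32, 32])
```

A richer variant could predict per-pixel rather than per-level weights; the key point suggested by the paper's findings is that adaptively fusing multiple levels compensates for the Transformer's weak locality bias.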
