Paper Title

Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth

Authors

Doyeon Kim, Woonghyun Ka, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim

Abstract

Depth estimation from a single image is an important task that can be applied to various fields in computer vision, and has grown rapidly with the development of convolutional neural networks. In this paper, we propose a novel structure and training strategy for monocular depth estimation to further improve the prediction accuracy of the network. We deploy a hierarchical transformer encoder to capture and convey the global context, and design a lightweight yet powerful decoder to generate an estimated depth map while considering local connectivity. By constructing connected paths between multi-scale local features and the global decoding stream with our proposed selective feature fusion module, the network can integrate both representations and recover fine details. In addition, the proposed decoder shows better performance than the previously proposed decoders, with considerably less computational complexity. Furthermore, we improve the depth-specific augmentation method by utilizing an important observation in depth estimation to enhance the model. Our network achieves state-of-the-art performance over the challenging depth dataset NYU Depth V2. Extensive experiments have been conducted to validate and show the effectiveness of the proposed approach. Finally, our model shows better generalisation ability and robustness than other comparative models.
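The abstract does not spell out the depth-specific augmentation, so the following is only a minimal sketch under stated assumptions. It assumes that "Vertical CutDepth" in the title means pasting a full-height vertical strip of the ground-truth depth map into the input image, so that the vertical layout of the scene is left intact. The function name, the parameter `p`, and the way the strip position and width are sampled are all hypothetical and are not taken from the text above. Images are plain nested lists to keep the example dependency-free.

```python
import random


def vertical_cutdepth(image, depth, p=0.75, rng=random):
    """Paste a full-height vertical strip of `depth` into a copy of `image`.

    image, depth: lists of H rows, each a list of W values (same shape).
    p: hypothetical hyperparameter that caps the strip width.
    Returns the augmented copy and the (left, right) column bounds used.
    """
    height, width = len(image), len(image[0])

    # Assumed sampling scheme: alpha picks the left edge, beta scales the width.
    alpha, beta = rng.random(), rng.random()
    left = int(alpha * width)
    strip = max(int((width - left) * beta * p), 1)  # at least one column
    right = min(left + strip, width)

    # Copy every row so the caller's image is not modified.
    out = [row[:] for row in image]
    for y in range(height):
        # The strip spans every row, i.e. the full image height.
        out[y][left:right] = depth[y][left:right]
    return out, (left, right)
```

A call such as `vertical_cutdepth(img, dep)` returns the augmented image together with the column range that was replaced. In a real training pipeline the depth values would also need to be scaled to the image's value range, which this sketch omits.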
