Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry

Paper title

Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry

Authors

Françani, André O., Maximo, Marcos R. O. A.

Abstract

Monocular visual odometry consists of the estimation of the position of an agent through images of a single camera, and it is applied in autonomous vehicles, medical robots, and augmented reality. However, monocular systems suffer from the scale ambiguity problem due to the lack of depth information in 2D frames. This paper contributes by showing an application of the dense prediction transformer model for scale estimation in monocular visual odometry systems. Experimental results show that the scale drift problem of monocular systems can be reduced through the accurate estimation of the depth map by this model, achieving competitive state-of-the-art performance on a visual odometry benchmark.
