Paper Title
Uncertainty-aware Vision-based Metric Cross-view Geolocalization
Paper Authors
Paper Abstract
This paper proposes a novel method for vision-based metric cross-view geolocalization (CVGL) that matches the camera images captured by a ground-based vehicle with an aerial image to determine the vehicle's geo-pose. Since aerial images are globally available at low cost, they represent a potential compromise between two established paradigms of autonomous driving, i.e., using expensive high-definition prior maps or relying entirely on the sensor data captured at runtime. We present an end-to-end differentiable model that uses the ground and aerial images to predict a probability distribution over possible vehicle poses. We combine multiple vehicle datasets with aerial images from orthophoto providers and demonstrate the feasibility of our method on this data. Since the ground truth poses are often inaccurate w.r.t. the aerial images, we implement a pseudo-label approach to produce more accurate ground truth poses and make them publicly available. While previous works require training data from the target region to achieve reasonable localization accuracy (i.e., same-area evaluation), our approach overcomes this limitation and outperforms previous results even in the strictly more challenging cross-area case. We improve upon the previous state of the art by a large margin even without ground or aerial data from the test region, which highlights the model's potential for global-scale application. We further integrate the uncertainty-aware predictions into a tracking framework to determine the vehicle's trajectory over time, resulting in a mean position error of 0.78 m on KITTI-360.
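To make the core mechanism concrete, below is a minimal sketch of how a probability distribution over discretized vehicle poses (x, y, yaw) can be obtained by cross-correlating a ground-image feature template against an aerial feature map, and how successive measurements can be fused in a simple recursive Bayesian tracking step. All function names, shapes, and helpers are assumptions for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch (PyTorch): dense pose scoring by cross-correlation plus
# softmax normalization, and a simple recursive Bayesian fusion step.
# Names, shapes, and the rotation helper are illustrative only.

import math
import torch
import torch.nn.functional as F

def rotate_template(feat, angle_deg):
    """Rotate a (C, h, w) feature template around its center (bilinear)."""
    a = math.radians(angle_deg)
    mat = torch.tensor([[math.cos(a), -math.sin(a), 0.0],
                        [math.sin(a),  math.cos(a), 0.0]])[None]
    grid = F.affine_grid(mat, [1, *feat.shape], align_corners=False)
    return F.grid_sample(feat[None], grid, align_corners=False)[0]

def pose_log_distribution(aerial_feat, ground_feat, num_rotations=16):
    """Log-probabilities over (yaw, y, x) pose candidates.

    aerial_feat: (C, H, W) feature map of the aerial image.
    ground_feat: (C, h, w) template embedding of the ground image(s).
    """
    scores = []
    for k in range(num_rotations):
        template = rotate_template(ground_feat, 360.0 * k / num_rotations)
        # Dense matching over all (y, x) offsets == 2D cross-correlation.
        s = F.conv2d(aerial_feat[None], template[None], padding='same')
        scores.append(s[0, 0])                      # (H, W)
    scores = torch.stack(scores)                    # (R, H, W)
    # Softmax over all candidates: the spread of the resulting distribution
    # expresses the localization uncertainty.
    return F.log_softmax(scores.flatten(), dim=0).view_as(scores)

def fuse(prior_log, measurement_log):
    """One recursive Bayesian update: posterior ∝ prior · likelihood.
    (A full tracker would also warp the prior by the vehicle's odometry.)"""
    post = prior_log + measurement_log
    return post - torch.logsumexp(post.flatten(), dim=0)
```

Under these assumptions, a tracker would call `pose_log_distribution` once per frame and chain `fuse` across frames, shifting the prior according to the vehicle's odometry before each update so that the posterior sharpens over time.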