Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias

Paper Title

Camera Pose Matters: Improving Depth Prediction by Mitigating Pose Distribution Bias

Authors

Yunhan Zhao, Shu Kong, Charless Fowlkes

Abstract

Monocular depth predictors are typically trained on large-scale training sets that are naturally biased with respect to the distribution of camera poses. As a result, trained predictors fail to make reliable depth predictions for test examples captured under uncommon camera poses. To address this issue, we propose two novel techniques that exploit the camera pose during training and prediction. First, we introduce a simple perspective-aware data augmentation that synthesizes new training examples with more diverse views by perturbing the existing ones in a geometrically consistent manner. Second, we propose a conditional model that exploits the per-image camera pose as prior knowledge by encoding it as part of the input. We show that jointly applying the two methods improves depth prediction on images captured under uncommon and even never-before-seen camera poses. We show that our methods improve performance when applied to a range of different predictor architectures. Lastly, we show that explicitly encoding the camera pose distribution improves the generalization performance of a synthetically trained depth predictor when evaluated on real images.
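
To make the idea of "encoding the camera pose as part of the input" concrete, here is a minimal sketch in Python with NumPy. It is a naive illustration and not the encoding used in the paper: it assumes the pose is summarized by two hypothetical scalars, `pitch_rad` and `height_m`, and simply tiles them into constant channels appended to the RGB image. The function name `append_pose_channels` is likewise an assumption made for this example.

```python
import numpy as np


def append_pose_channels(image, pitch_rad, height_m):
    """Append per-image camera pose to an RGB image as constant channels.

    image:     float array of shape (H, W, 3).
    pitch_rad: camera pitch in radians (hypothetical pose parameter).
    height_m:  camera height in meters (hypothetical pose parameter).
    Returns an array of shape (H, W, 5): RGB followed by two pose channels.
    """
    h, w, _ = image.shape
    # Pack the two pose scalars and tile them over every pixel location.
    pose = np.array([pitch_rad, height_m], dtype=image.dtype)
    pose_maps = np.broadcast_to(pose, (h, w, 2))
    # Stack the pose maps behind the color channels.
    return np.concatenate([image, pose_maps], axis=-1)


# Usage: a dummy 4x6 image with a pitch of 0.1 rad and a height of 1.5 m.
example = append_pose_channels(np.zeros((4, 6, 3), dtype=np.float32), 0.1, 1.5)
```

A depth network taking this input would need five input channels instead of three. Constant channels are only the simplest way to condition on pose; an encoding that varies per pixel with the scene geometry could carry more information.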
