Paper Title

Towards General Purpose Geometry-Preserving Single-View Depth Estimation

Authors

Mikhail Romanov, Nikolay Patatkin, Anna Vorontsova, Sergey Nikolenko, Anton Konushin, Dmitry Senyushkin

Abstract

Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics, providing the geometry of a scene from a single image. Recent works have shown that a successful solution strongly relies on the diversity and volume of training data. Such data can be sourced from stereo movies and photos; however, these sources do not provide geometrically complete depth maps, since the disparities contain an unknown shift value. As a result, existing models trained on this data cannot recover correct 3D representations. Our work shows that a model trained on this data along with conventional datasets can gain accuracy while predicting correct scene geometry. Surprisingly, only a small portion of geometrically correct depth maps is required to train a model that performs on par with a model trained on a fully geometrically correct dataset. Using the proposed method, we then train computationally efficient models on a mixture of datasets. Through quantitative comparison on completely unseen datasets and qualitative comparison of 3D point clouds, we show that our model defines the new state of the art in general-purpose SVDE.
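The abstract's central technical point is that disparity derived from stereo movies and photos is known only up to an unknown scale and shift, so naively inverting it yields geometrically distorted depth. The following NumPy sketch is our own illustration of that effect (not the authors' training procedure); the variable names and the least-squares alignment against geometrically correct reference disparities are assumptions for demonstration only.

```python
import numpy as np

# Minimal sketch: stereo-movie data gives disparity only up to an unknown
# scale s and shift t, i.e. d_pred ~= s * d_true + t. Depth is the reciprocal
# of disparity, so the shift cannot be absorbed into a global rescaling of
# the recovered point cloud.

rng = np.random.default_rng(0)
d_true = rng.uniform(0.1, 1.0, size=1000)   # ground-truth disparity (synthetic)
s, t = 2.0, 0.3                             # unknown scale and shift
d_pred = s * d_true + t                     # what a scale/shift-ambiguous model provides

depth_true = 1.0 / d_true
depth_wrong = 1.0 / d_pred                  # shift ignored -> distorted geometry
# The ratio is not constant, so the error is not a harmless global rescaling:
print(np.std(depth_wrong / depth_true))     # > 0

# Recovering correct geometry requires estimating s and t, e.g. by least
# squares against a few geometrically correct reference disparities:
A = np.stack([d_true, np.ones_like(d_true)], axis=1)
s_hat, t_hat = np.linalg.lstsq(A, d_pred, rcond=None)[0]
depth_fixed = 1.0 / ((d_pred - t_hat) / s_hat)
print(np.allclose(depth_fixed, depth_true))  # True
```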
