Monocular Differentiable Rendering for Self-Supervised 3D Object Detection

Paper Title

Monocular Differentiable Rendering for Self-Supervised 3D Object Detection

Authors

Beker, Deniz; Kato, Hiroharu; Morariu, Mihai Adrian; Ando, Takahiro; Matsuoka, Toru; Kehl, Wadim; Gaidon, Adrien

Abstract

3D object detection from monocular images is an ill-posed problem due to the projective entanglement of depth and scale. To overcome this ambiguity, we present a novel self-supervised method for textured 3D shape reconstruction and pose estimation of rigid objects with the help of strong shape priors and 2D instance masks. Our method predicts the 3D location and meshes of each object in an image using differentiable rendering and a self-supervised objective derived from a pretrained monocular depth estimation network. We use the KITTI 3D object detection dataset to evaluate the accuracy of the method. Experiments demonstrate that we can effectively use noisy monocular depth and differentiable rendering as an alternative to expensive 3D ground-truth labels or LiDAR information.
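
The "projective entanglement of depth and scale" mentioned in the abstract can be illustrated with a minimal pinhole-camera sketch. This is not the paper's method; the focal length, principal point, and box dimensions below are made-up values chosen only for illustration, and `project` and `box_corners` are hypothetical helper names.

```python
import itertools
import numpy as np

def project(points_3d, focal=700.0, cx=600.0, cy=180.0):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels.

    The intrinsics are illustrative assumptions, not real calibration values.
    """
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    return np.stack([focal * x / z + cx, focal * y / z + cy], axis=1)

def box_corners(center, size):
    """Return the 8 corners of an axis-aligned box as an (8, 3) array."""
    signs = np.array(list(itertools.product([-0.5, 0.5], repeat=3)))
    return np.asarray(center, dtype=float) + signs * np.asarray(size, dtype=float)

# A car-sized box roughly 10 m in front of the camera (assumed dimensions).
near = box_corners(center=[1.0, 0.5, 10.0], size=[1.8, 1.5, 4.0])

# Scale the whole box about the camera origin: twice as large, twice as far.
far = 2.0 * near

near_px = project(near)
far_px = project(far)

# Both boxes land on exactly the same pixels, so a single image cannot
# tell a small nearby object from a large distant one.
same = np.allclose(near_px, far_px)
print("identical projections:", same)
```

Because the two boxes are indistinguishable in the image, some additional signal is needed to fix the scale; the abstract names shape priors and monocular depth as the signals used for this purpose.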
