Paper Title
End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
Paper Authors
Paper Abstract
Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks -- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission. Our code will be made available at https://github.com/mileyan/pseudo-LiDAR_e2e.
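The core change of representation the abstract describes — turning a 2D depth map into a 3D point cloud — is standard pinhole-camera back-projection: each pixel (u, v) with depth z maps to x = (u − cx)·z / fx, y = (v − cy)·z / fy. The sketch below is a minimal NumPy illustration of that conversion, not the paper's implementation; the function name and intrinsics are hypothetical, and the paper's CoR module additionally makes this step differentiable for end-to-end training.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, in metres) into an (H*W, 3) point cloud
    using the pinhole camera model. Illustrative sketch only; fx/fy are focal
    lengths in pixels and (cx, cy) is the principal point."""
    h, w = depth.shape
    # Pixel coordinate grids: u varies along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Stack into per-pixel (x, y, z) points and flatten to a point list.
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a fronto-parallel plane 10 m away, with made-up intrinsics.
pc = depth_to_point_cloud(np.full((4, 4), 10.0), fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

In the actual pipeline, a stereo depth-estimation network produces the depth map, this conversion yields a pseudo-LiDAR point cloud, and a LiDAR-based detector such as PointRCNN consumes it.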