Paper Title

BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework

Authors

Tingting Liang, Hongwei Xie, Kaicheng Yu, Zhongyu Xia, Zhiwei Lin, Yongtao Wang, Tao Tang, Bing Wang, Zhi Tang

Abstract

Fusing camera and LiDAR information has become a de-facto standard for 3D object detection tasks. Current methods rely on point clouds from the LiDAR sensor as queries to leverage features from the image space. However, this underlying assumption makes current fusion frameworks unable to produce any prediction when a LiDAR malfunction occurs, whether minor or major, which fundamentally limits their deployment in realistic autonomous driving scenarios. In contrast, we propose a surprisingly simple yet novel fusion framework, dubbed BEVFusion, whose camera stream does not depend on the LiDAR input, thus addressing the downside of previous methods. We empirically show that our framework surpasses the state-of-the-art methods under the normal training setting. Under robustness training settings that simulate various LiDAR malfunctions, our framework significantly surpasses the state-of-the-art methods by 15.7% to 28.9% mAP. To the best of our knowledge, ours is the first framework to handle realistic LiDAR malfunctions, and it can be deployed in realistic scenarios without any post-processing procedure. The code is available at https://github.com/ADLab-AutoDrive/BEVFusion.
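
The key architectural claim in the abstract is that the camera stream produces bird's-eye-view (BEV) features independently of the LiDAR stream, so the fused detector degrades gracefully rather than failing outright when LiDAR input is missing. The snippet below is a minimal PyTorch sketch of that idea, not the authors' implementation (see the linked repository for that): the module names, the 1x1-convolution stand-ins for the per-modality encoders, the assumption that camera features are already rasterized into BEV, and the concatenation-based fuser are all illustrative simplifications.

```python
# Minimal sketch of independent two-stream BEV fusion, assuming both
# modalities can be encoded into BEV feature maps of the same shape.
# All class/module names here are hypothetical placeholders.
import torch
import torch.nn as nn


class BEVFusionSketch(nn.Module):
    def __init__(self, bev_channels: int = 64, num_classes: int = 10):
        super().__init__()
        # Stand-ins for the per-modality encoders; each maps its input
        # to a BEV feature map of shape (B, bev_channels, H, W).
        self.camera_stream = nn.Conv2d(3, bev_channels, kernel_size=1)
        self.lidar_stream = nn.Conv2d(1, bev_channels, kernel_size=1)
        # Fuse by concatenating the two BEV maps and mixing channels.
        self.fuser = nn.Conv2d(2 * bev_channels, bev_channels,
                               kernel_size=3, padding=1)
        self.head = nn.Conv2d(bev_channels, num_classes, kernel_size=1)

    def forward(self, camera_bev_input, lidar_bev_input=None):
        cam_bev = self.camera_stream(camera_bev_input)
        if lidar_bev_input is None:
            # Simulated LiDAR malfunction: the camera stream alone still
            # yields a valid BEV feature map, so prediction does not fail.
            lidar_bev = torch.zeros_like(cam_bev)
        else:
            lidar_bev = self.lidar_stream(lidar_bev_input)
        fused = self.fuser(torch.cat([cam_bev, lidar_bev], dim=1))
        return self.head(fused)


# Usage: predictions are produced with or without the LiDAR input.
model = BEVFusionSketch()
cam = torch.randn(1, 3, 128, 128)    # camera features, already in BEV
lidar = torch.randn(1, 1, 128, 128)  # rasterized LiDAR BEV map
out_full = model(cam, lidar)  # normal operation
out_cam_only = model(cam)     # total LiDAR failure, still predicts
```

The design point this illustrates is that fusion happens in a shared BEV space only after each stream has finished its own encoding, so zeroing out one modality's BEV map is a well-defined fallback rather than a hard failure, in contrast to methods that use LiDAR points as queries into image features.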
