Paper Title

MMFN: Multi-Modal-Fusion-Net for End-to-End Driving

Paper Authors

Qingwen Zhang, Mingkai Tang, Ruoyu Geng, Feiyi Chen, Ren Xin, Lujia Wang

Paper Abstract

Inspired by the fact that humans use diverse sensory organs to perceive the world, sensors with different modalities are deployed in end-to-end driving to obtain the global context of the 3D scene. In previous works, camera and LiDAR inputs are fused through transformers for better driving performance. These inputs are normally further interpreted as high-level map information to assist navigation tasks. Nevertheless, extracting useful information from the complex map input is challenging, since redundant information may mislead the agent and negatively affect driving performance. We propose a novel approach to efficiently extract features from vectorized High-Definition (HD) maps and utilize them in end-to-end driving tasks. In addition, we design a new expert to further enhance model performance by considering multi-road rules. Experimental results show that both of the proposed improvements enable our agent to achieve superior performance compared with other methods.
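
To make the architecture described in the abstract more concrete, the sketch below shows one plausible shape of the pipeline in PyTorch: camera and LiDAR feature tokens are fused with vectorized HD-map tokens through a transformer encoder, where the map polylines are embedded by a VectorNet-style shared-MLP-plus-max-pool encoder. Every module name, dimension, and design detail here (`PolylineEncoder`, `FusionNet`, the waypoint head) is an assumption made for illustration, not the authors' actual MMFN implementation.

```python
# Illustrative sketch only: transformer fusion of camera, LiDAR, and
# vectorized HD-map features. Module names and dimensions are assumptions,
# not the authors' actual MMFN code.
import torch
import torch.nn as nn


class PolylineEncoder(nn.Module):
    """Encodes vectorized HD-map polylines (e.g. lane centerlines) into
    fixed-size tokens via a shared MLP and max-pooling (VectorNet-style)."""

    def __init__(self, point_dim=4, embed_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(point_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, polylines):
        # polylines: (batch, num_polylines, points_per_polyline, point_dim)
        x = self.mlp(polylines)
        return x.max(dim=2).values  # pool over points -> (B, N, embed_dim)


class FusionNet(nn.Module):
    """Concatenates camera, LiDAR, and map tokens, fuses them with a
    transformer encoder, then regresses future waypoints."""

    def __init__(self, embed_dim=128, num_waypoints=4):
        super().__init__()
        self.map_encoder = PolylineEncoder(embed_dim=embed_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True
        )
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(embed_dim, num_waypoints * 2)  # (x, y) per waypoint

    def forward(self, cam_tokens, lidar_tokens, map_polylines):
        map_tokens = self.map_encoder(map_polylines)
        tokens = torch.cat([cam_tokens, lidar_tokens, map_tokens], dim=1)
        fused = self.fusion(tokens)
        return self.head(fused.mean(dim=1))  # average-pool, predict waypoints


# Smoke test with random tensors standing in for backbone features.
net = FusionNet()
cam = torch.randn(2, 16, 128)         # e.g. image-backbone feature tokens
lidar = torch.randn(2, 16, 128)       # e.g. point-cloud-backbone tokens
polylines = torch.randn(2, 8, 10, 4)  # 8 polylines x 10 points x (x, y, dx, dy)
print(net(cam, lidar, polylines).shape)  # torch.Size([2, 8])
```

One reason a pooled polyline encoding like this is attractive for the problem the abstract raises: max-pooling over points keeps only the strongest feature responses per polyline, which is one plausible way to discard redundant map detail rather than feeding it all to the driving agent.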
