A Comparative Analysis of Decision-Level Fusion for Multimodal Driver Behaviour Understanding

Paper Title

A Comparative Analysis of Decision-Level Fusion for Multimodal Driver Behaviour Understanding

Authors

Alina Roitberg, Kunyu Peng, Zdravko Marinov, Constantin Seibold, David Schneider, Rainer Stiefelhagen

Abstract

Visual recognition inside the vehicle cabin leads to safer driving and more intuitive human-vehicle interaction, but such systems face substantial obstacles: they need to capture different granularities of driver behaviour while dealing with highly limited body visibility and changing illumination. Multimodal recognition mitigates a number of these issues, as the prediction outcomes of different sensors complement each other thanks to modality-specific strengths and weaknesses. While several late fusion methods have been considered in previously published frameworks, these frameworks consistently differ in their architecture backbones and building blocks, making it very hard to isolate the role of the chosen late fusion strategy itself. This paper presents an empirical evaluation of different paradigms for decision-level late fusion in video-based driver observation. We compare seven different mechanisms for joining the results of single-modal classifiers, covering both those that have been popular (e.g. score averaging) and those not yet considered (e.g. rank-level fusion) in the context of driver observation, and evaluate them based on different criteria and benchmark settings. This is the first systematic study of strategies for fusing outcomes of multimodal predictors inside the vehicle, conducted with the goal of providing guidance for fusion scheme selection.
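To make the two fusion families named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each single-modal classifier outputs one score per class, and it uses the Borda count as one common form of rank-level fusion; the abstract does not say which rank-level variant the paper evaluates. The function names and the example scores are hypothetical.

```python
def score_average(scores):
    """Score-level fusion: average each class score over all modalities."""
    n_classes = len(scores[0])
    means = [sum(m[c] for m in scores) / len(scores) for c in range(n_classes)]
    return max(range(n_classes), key=means.__getitem__)


def borda_count(scores):
    """Rank-level fusion (Borda count, an assumed variant): every modality
    gives a class as many points as the number of classes it ranks below it."""
    n_classes = len(scores[0])
    points = [0] * n_classes
    for modality_scores in scores:
        # Classes ordered from lowest to highest score; position = points.
        order = sorted(range(n_classes), key=modality_scores.__getitem__)
        for pts, cls in enumerate(order):
            points[cls] += pts
    return max(range(n_classes), key=points.__getitem__)


# Hypothetical outputs of three single-modal classifiers over four classes.
scores = [
    [0.90, 0.06, 0.03, 0.01],  # modality 1: very confident in class 0
    [0.20, 0.45, 0.30, 0.05],  # modality 2: prefers class 1
    [0.25, 0.40, 0.30, 0.05],  # modality 3: prefers class 1
]

print(score_average(scores))  # 0
print(borda_count(scores))    # 1
```

In this toy case a single overconfident modality dominates the averaged scores, while the rank-based vote follows the majority ordering. This kind of difference is one reason the choice of fusion strategy can matter.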

