DRAMA: Joint Risk Localization and Captioning in Driving

Paper Title

DRAMA: Joint Risk Localization and Captioning in Driving

Authors

Srikanth Malla, Chiho Choi, Isht Dwivedi, Joon Hee Choi, Jiachen Li

Abstract

Considering the functionality of situational awareness in safety-critical automation systems, the perception of risk in driving scenes and its explainability is of particular importance for autonomous and cooperative driving. Toward this goal, this paper proposes a new research direction of joint risk localization in driving scenes and its risk explanation as a natural language description. Due to the lack of standard benchmarks, we collected a large-scale dataset, DRAMA (Driving Risk Assessment Mechanism with A captioning module), which consists of 17,785 interactive driving scenarios collected in Tokyo, Japan. Our DRAMA dataset accommodates video- and object-level questions on driving risks with associated important objects to achieve the goal of visual captioning as a free-form language description utilizing closed and open-ended responses for multi-level questions, which can be used to evaluate a range of visual captioning capabilities in driving scenarios. We make this data available to the community for further research. Using DRAMA, we explore multiple facets of joint risk localization and captioning in interactive driving scenarios. In particular, we benchmark various multi-task prediction architectures and provide a detailed analysis of joint risk localization and risk captioning. The dataset is available at https://usa.honda-ri.com/drama.
