PedFormer: Pedestrian Behavior Prediction via Cross-Modal Attention Modulation and Gated Multitask Learning

Paper title

PedFormer: Pedestrian Behavior Prediction via Cross-Modal Attention Modulation and Gated Multitask Learning

Authors

Amir Rasouli, Iuliia Kotseruba

Abstract

Predicting pedestrian behavior is a crucial task for intelligent driving systems. Accurate predictions require a deep understanding of various contextual elements that potentially impact the way pedestrians behave. To address this challenge, we propose a novel framework that relies on different data modalities to predict future trajectories and crossing actions of pedestrians from an ego-centric perspective. Specifically, our model utilizes a cross-modal Transformer architecture to capture dependencies between different data types. The output of the Transformer is augmented with representations of interactions between pedestrians and other traffic agents conditioned on the pedestrian and ego-vehicle dynamics that are generated via a semantic attentive interaction module. Lastly, the context encodings are fed into a multi-stream decoder framework using a gated-shared network. We evaluate our algorithm on public pedestrian behavior benchmarks, PIE and JAAD, and show that our model improves state-of-the-art in trajectory and action prediction by up to 22% and 13% respectively on various metrics. The advantages brought by components of our model are investigated via extensive ablation studies.
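The abstract names a cross-modal attention step and a gated mechanism for sharing features across tasks, but gives no implementation detail. The NumPy sketch below illustrates only the general idea: features from one modality (here, a hypothetical trajectory stream) attend over features from another (a hypothetical visual stream), and a sigmoid gate then blends shared and task-specific features. All names, dimensions, random weights, and the gating formula are assumptions made for illustration; this is not the authors' architecture.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from another.

    query_feats:   (T_q, d_q) features of the querying modality
    context_feats: (T_c, d_c) features of the attended modality
    """
    q = query_feats @ w_q      # (T_q, d)
    k = context_feats @ w_k    # (T_c, d)
    v = context_feats @ w_v    # (T_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


def gated_mix(shared, task_specific, w_gate):
    """Blend shared and task-specific features with an element-wise
    sigmoid gate (an assumed, generic gating form)."""
    logits = np.concatenate([shared, task_specific], axis=-1) @ w_gate
    gate = 1.0 / (1.0 + np.exp(-logits))
    return gate * shared + (1.0 - gate) * task_specific, gate


# Hypothetical sizes: 15 observed time steps, 4-d box coordinates,
# 32-d visual features, 16-d model width.
rng = np.random.default_rng(0)
T, d_traj, d_vis, d = 15, 4, 32, 16
traj = rng.normal(size=(T, d_traj))
vis = rng.normal(size=(T, d_vis))

w_q = rng.normal(size=(d_traj, d)) * 0.1
w_k = rng.normal(size=(d_vis, d)) * 0.1
w_v = rng.normal(size=(d_vis, d)) * 0.1

# Trajectory features query the visual stream.
attended, weights = cross_modal_attention(traj, vis, w_q, w_k, w_v)

# Stand-in task-specific features (e.g. for a trajectory decoder).
task_feats = rng.normal(size=(T, d))
w_gate = rng.normal(size=(2 * d, d)) * 0.1
mixed, gate = gated_mix(attended, task_feats, w_gate)
```

In a multitask setting, one such gate per task (for example, one for trajectory prediction and one for crossing-action prediction) would let each decoder choose how much of the shared context to use; that pairing is likewise an assumption here.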
