From Recognition to Prediction: Analysis of Human Action and Trajectory Prediction in Video

Paper Title

From Recognition to Prediction: Analysis of Human Action and Trajectory Prediction in Video

Author

Liang, Junwei

Abstract

With advances in deep learning for computer vision, systems can now analyze an unprecedented amount of rich visual information from videos, enabling applications such as autonomous driving, socially aware robot assistants, and public safety monitoring. Deciphering human behavior in video to predict people's future paths (trajectories) and what they will do is important in these applications. However, human trajectory prediction remains a challenging task, because scene semantics and human intent are difficult to model. Many systems do not provide high-level semantic attributes for reasoning about a pedestrian's future. This design hinders prediction performance on video data from diverse domains and unseen scenarios. To enable optimal forecasting of future human behavior, it is crucial for the system to detect and analyze human activities as well as scene semantics, and to pass informative features to the subsequent prediction module for context understanding.

