Paper Title

When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs

Authors

Oana Ignat, Santiago Castro, Yuhang Zhou, Jiajun Bao, Dandan Shan, Rada Mihalcea

Abstract

We consider the task of temporal human action localization in lifestyle vlogs. We introduce a novel dataset consisting of manual annotations of temporal localization for 13,000 narrated actions in 1,200 video clips. We present an extensive analysis of this data, which allows us to better understand how the language and visual modalities interact throughout the videos. We propose a simple yet effective method to localize the narrated actions based on their expected duration. Through several experiments and analyses, we show that our method brings complementary information with respect to previous methods, and leads to improvements over previous work for the task of temporal action localization.
