Paper Title

Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos

Authors

Reza Ghoddoosian, Isht Dwivedi, Nakul Agarwal, Chiho Choi, Behzad Dariush

Abstract

This paper addresses a new problem: weakly-supervised online action segmentation in instructional videos. We present a framework that segments streaming videos online at test time using Dynamic Programming, and we show its advantages over a greedy sliding-window approach. We improve the framework by introducing the Online-Offline Discrepancy Loss (OODL), which encourages segmentation results with higher temporal consistency. Furthermore, during training only, we exploit frame-wise correspondence between multiple views as supervision for training on weakly-labeled instructional videos. In particular, we investigate three multi-view inference techniques that generate more accurate frame-wise pseudo ground truth at no additional annotation cost. We present results and ablation studies on two benchmark multi-view datasets, Breakfast and IKEA ASM. Experimental results show the efficacy of the proposed methods, both qualitatively and quantitatively, in two domains: cooking and assembly.
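To make the contrast between online dynamic programming and greedy per-frame decisions concrete, here is a minimal toy sketch. It is not the authors' implementation: it assumes an ordered action list (`transcript`) is known and that per-frame class log-probabilities arrive one frame at a time, neither of which the abstract states. The function names `online_dp_segment` and `greedy_segment` and the example probabilities are hypothetical.

```python
import math


def online_dp_segment(frame_log_probs, transcript):
    """Label each frame as it arrives with a monotonic alignment DP.

    Assumption (not from the abstract): `transcript` is a known ordered
    list of action ids, and each action may only be followed by itself
    or by the next action in the list.
    """
    n = len(transcript)
    scores = [-math.inf] * n  # best log-score of a path ending at each position
    labels = []
    for t, log_probs in enumerate(frame_log_probs):
        if t == 0:
            # The first frame must belong to the first action.
            new_scores = [-math.inf] * n
            new_scores[0] = log_probs[transcript[0]]
        else:
            new_scores = []
            for i in range(n):
                stay = scores[i]
                advance = scores[i - 1] if i > 0 else -math.inf
                new_scores.append(max(stay, advance) + log_probs[transcript[i]])
        scores = new_scores
        # Online decision: the action at the best-scoring position so far.
        best = max(range(n), key=lambda i: scores[i])
        labels.append(transcript[best])
    return labels


def greedy_segment(frame_log_probs):
    """Baseline: pick the most likely class for each frame independently."""
    return [max(range(len(lp)), key=lambda c: lp[c]) for lp in frame_log_probs]


# Hypothetical per-frame class probabilities; frame 1 is a noisy spike for class 2.
probs = [
    [0.8, 0.1, 0.1],
    [0.3, 0.1, 0.6],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]
log_probs = [[math.log(p) for p in row] for row in probs]

dp_labels = online_dp_segment(log_probs, [0, 1, 2])
greedy_labels = greedy_segment(log_probs)
print(dp_labels)      # [0, 0, 0, 1, 2]
print(greedy_labels)  # [0, 2, 0, 1, 2]
```

In this toy, the greedy baseline jumps to class 2 on the noisy second frame, while the ordering constraint in the DP keeps the label at class 0 until the evidence supports moving on. This illustrates the kind of temporal consistency the abstract refers to, under the stated assumptions only.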
