Timestamp-Supervised Action Segmentation from the Perspective of Clustering

Paper title

Timestamp-Supervised Action Segmentation from the Perspective of Clustering

Authors

Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Fuchun Sun

Abstract

Video action segmentation under timestamp supervision has recently received much attention because of its lower annotation cost. Most existing methods generate pseudo-labels for all frames in each video to train the segmentation model. However, these methods suffer from incorrect pseudo-labels, especially for the semantically unclear frames in the transition region between two consecutive actions, which we call ambiguous intervals. To address this issue, we propose a novel framework from the perspective of clustering, which consists of two parts. First, pseudo-label ensembling generates incomplete but high-quality pseudo-label sequences, in which the frames in ambiguous intervals have no pseudo-labels. Second, iterative clustering propagates the pseudo-labels to the ambiguous intervals by clustering, and thus updates the pseudo-label sequences used to train the model. We further introduce a clustering loss, which encourages the features of frames within the same action segment to be more compact. Extensive experiments show the effectiveness of our method.
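
The abstract does not spell out how pseudo-labels are propagated into ambiguous intervals, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes each frame has a feature vector, that unlabeled frames are marked with `None`, and that each ambiguous interval is split at the single boundary that minimizes squared distance to the centroids of the two neighboring labeled segments. The names `propagate_labels` and `segment_centroid` are hypothetical, and the sketch fills every interval in one pass rather than alternating between clustering and model training.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def segment_centroid(features, labels, idx, direction):
    """Centroid of the labeled run that contains idx, scanned in one direction.

    direction is -1 to scan leftwards from idx, +1 to scan rightwards.
    """
    label = labels[idx]
    members = []
    i = idx
    while 0 <= i < len(labels) and labels[i] == label:
        members.append(features[i])
        i += direction
    return centroid(members)


def propagate_labels(features: Sequence[Vector],
                     labels: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Fill each unlabeled run (an assumed 'ambiguous interval') with labels.

    Assumption: frames before the chosen split take the left neighbor's
    label and the remaining frames take the right neighbor's label.
    """
    out = list(labels)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # the unlabeled run is [start, end)
        left = out[start - 1] if start > 0 else None
        right = out[end] if end < n else None
        if left is None and right is None:
            continue  # no labeled neighbor to propagate from
        if left is None or right is None or left == right:
            fill = left if left is not None else right
            out[start:end] = [fill] * (end - start)
            continue
        # Centroids come from the original high-quality labels only.
        c_left = segment_centroid(features, labels, start - 1, -1)
        c_right = segment_centroid(features, labels, end, +1)
        best_split, best_cost = start, float("inf")
        for split in range(start, end + 1):
            cost = sum(sq_dist(features[t], c_left) for t in range(start, split))
            cost += sum(sq_dist(features[t], c_right) for t in range(split, end))
            if cost < best_cost:
                best_split, best_cost = split, cost
        for t in range(start, end):
            out[t] = left if t < best_split else right
    return out
```

For example, with one-dimensional features `[[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]` and labels `[0, 0, None, None, 1, 1]`, the two unlabeled frames are assigned to the action whose centroid they are closer to, and the boundary falls between them.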
