Paper Title
Skeleton-Based Action Segmentation with Multi-Stage Spatial-Temporal Graph Convolutional Neural Networks
Paper Authors
Abstract
The ability to identify and temporally segment fine-grained actions in motion capture sequences is crucial for applications in human movement analysis. Motion capture is typically performed with optical or inertial measurement systems, which encode human movement as a time series of joint locations and orientations or their higher-order representations. State-of-the-art action segmentation approaches use multiple stages of temporal convolutions: the main idea is to generate initial predictions with several layers of temporal convolutions and to refine them over multiple stages, also with temporal convolutions. Although these approaches capture long-term temporal patterns, the initial predictions do not adequately consider the spatial hierarchy among the human joints. To address this limitation, we recently introduced multi-stage spatial-temporal graph convolutional neural networks (MS-GCN). Our framework replaces the initial stage of temporal convolutions with spatial graph convolutions and dilated temporal convolutions, which better exploit the spatial configuration of the joints and their long-term temporal dynamics. Our framework was compared to four strong baselines on five tasks. Experimental results demonstrate that our framework is a strong baseline for skeleton-based action segmentation.
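To make the two building blocks of the initial stage concrete, the following is a minimal, purely illustrative sketch (not the authors' implementation): a spatial step that averages each joint's feature with its skeletal neighbours via an adjacency matrix, and a causal dilated temporal convolution applied per joint. The 4-joint chain skeleton, the mean aggregation, and the function names are all assumptions for illustration; stacking temporal layers with dilations 1, 2, 4, ... grows the receptive field exponentially, which is the mechanism for capturing long-term dynamics.

```python
# Toy skeleton: 4 joints in a chain (0-1-2-3), adjacency with self-loops.
# This graph and the mean aggregation are illustrative assumptions only.
ADJ = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

def spatial_graph_conv(frames, adj):
    """Replace each joint's feature with the mean over its skeletal
    neighbours (per frame) -- a toy stand-in for a spatial graph
    convolution over the joint configuration."""
    out = []
    for frame in frames:  # frame: one scalar feature per joint
        new = []
        for row in adj:
            vals = [frame[i] for i, a in enumerate(row) if a]
            new.append(sum(vals) / len(vals))
        out.append(new)
    return out

def dilated_temporal_conv(frames, kernel, dilation):
    """Causal dilated 1-D convolution over time, applied independently
    to each joint. Zero-padding on the left keeps the output the same
    length as the input; taps are `dilation` frames apart."""
    T, J = len(frames), len(frames[0])
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [[0.0] * J] * pad + list(frames)
    return [
        [sum(kernel[i] * padded[t + i * dilation][j] for i in range(k))
         for j in range(J)]
        for t in range(T)
    ]
```

For example, with kernel size 2 and dilations 1, 2, and 4 across three layers, each output frame depends on 1 + (2-1)*(1+2+4) = 8 input frames, so deeper stacks see progressively longer temporal context without extra parameters per layer.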