Paper Title
Learning Constrained Dynamic Correlations in Spatiotemporal Graphs for Motion Prediction
Paper Authors
Paper Abstract
Human motion prediction is challenging due to the complexity of modeling spatiotemporal features. Among existing methods, graph convolutional networks (GCNs) are widely used because of their strength in explicit connection modeling. Within a GCN, the graph correlation adjacency matrix drives feature aggregation and is the key to extracting predictive motion features. State-of-the-art methods decompose the spatiotemporal correlation into spatial correlations for each frame and temporal correlations for each joint. Directly parameterizing these correlations introduces redundant parameters to represent the common relations shared by all frames and all joints. Moreover, the spatiotemporal graph adjacency matrix is the same for all motion samples and therefore cannot reflect sample-wise correspondence variances. To overcome these two bottlenecks, we propose the dynamic spatiotemporal decompose graph convolution (DSTD-GC), which uses only 28.6% of the parameters of the state-of-the-art GC. The key to DSTD-GC is constrained dynamic correlation modeling, which explicitly parameterizes the common static constraints as a spatial/temporal vanilla adjacency matrix shared by all frames/joints and dynamically extracts correspondence variances for each frame/joint with an adjustment modeling function. For each sample, the common constrained adjacency matrices are fixed to represent generic motion patterns, while the extracted variances complete the matrices with sample-specific pattern adjustments. Meanwhile, we mathematically reformulate GCs on spatiotemporal graphs into a unified form and find that DSTD-GC relaxes certain constraints of other GCs, which contributes to better representation capability. By combining DSTD-GC with prior knowledge, we propose a powerful spatiotemporal GCN called DSTD-GCN, which outperforms state-of-the-art methods by $3.9\% \sim 8.7\%$ in prediction accuracy with $55.0\% \sim 96.9\%$ fewer parameters.
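To make the constrained dynamic correlation modeling concrete, below is a minimal PyTorch sketch of the spatial side of the idea as the abstract describes it: one vanilla adjacency matrix shared by all frames encodes the common static constraint ($A_t = \tilde{A} + \Delta_t(X)$ for each frame $t$), while a lightweight adjustment function predicts a per-sample, per-frame correction that completes the matrix. The class name `ConstrainedDynamicGC`, the single-linear adjustment function, and the identity initialization are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class ConstrainedDynamicGC(nn.Module):
    """Sketch of constrained dynamic correlation modeling (spatial branch).

    A single J x J vanilla adjacency matrix is shared by all frames (the
    common static constraint). A per-sample, per-frame adjustment predicted
    from the input features completes the matrix with sample-specific
    pattern variances.
    """

    def __init__(self, num_joints: int, in_dim: int, out_dim: int):
        super().__init__()
        # Common static constraint: one J x J matrix shared by all frames
        # and all samples (identity init is an illustrative choice).
        self.A_static = nn.Parameter(torch.eye(num_joints))
        # Adjustment modeling function: maps each joint's features to one
        # row of a per-frame J x J correction (hypothetical choice; the
        # paper's exact function may differ).
        self.adjust = nn.Linear(in_dim, num_joints)
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, joints, in_dim)
        # Dynamic variance: one J x J adjustment per sample and per frame.
        delta = self.adjust(x)                      # (B, T, J, J)
        # Shared static constraint + extracted sample-wise variance.
        A = self.A_static + delta                   # (B, T, J, J)
        # Feature aggregation driven by the completed adjacency matrix.
        out = torch.einsum('btjk,btkc->btjc', A, x)  # (B, T, J, in_dim)
        return self.proj(out)


# Minimal usage: a batch of 4 samples, 10 frames of 22 joints in 3-D.
gc = ConstrainedDynamicGC(num_joints=22, in_dim=3, out_dim=16)
y = gc(torch.randn(4, 10, 22, 3))
print(y.shape)  # torch.Size([4, 10, 22, 16])
```

Note the parameter saving this structure implies: instead of learning a separate J x J matrix for every frame, only one shared matrix plus a small adjustment function is parameterized, while the temporal branch would mirror this with a T x T matrix shared by all joints.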