Paper Title

Synthetic Training for Accurate 3D Human Pose and Shape Estimation in the Wild

Paper Authors

Akash Sengupta, Ignas Budvytis, Roberto Cipolla

Paper Abstract

This paper addresses the problem of monocular 3D human shape and pose estimation from an RGB image. Despite great progress in this field in terms of pose prediction accuracy, state-of-the-art methods often predict inaccurate body shapes. We suggest that this is primarily due to the scarcity of in-the-wild training data with diverse and accurate body shape labels. Thus, we propose STRAPS (Synthetic Training for Real Accurate Pose and Shape), a system that utilises proxy representations, such as silhouettes and 2D joints, as inputs to a shape and pose regression neural network, which is trained with synthetic training data (generated on-the-fly during training using the SMPL statistical body model) to overcome data scarcity. We bridge the gap between synthetic training inputs and noisy real inputs, which are predicted by keypoint detection and segmentation CNNs at test-time, by using data augmentation and corruption during training. In order to evaluate our approach, we curate and provide a challenging evaluation dataset for monocular human shape estimation, Sports Shape and Pose 3D (SSP-3D). It consists of RGB images of tightly-clothed sports-persons with a variety of body shapes and corresponding pseudo-ground-truth SMPL shape and pose parameters, obtained via multi-frame optimisation. We show that STRAPS outperforms other state-of-the-art methods on SSP-3D in terms of shape prediction accuracy, while remaining competitive with the state-of-the-art on pose-centric datasets and metrics.
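A central idea in the abstract is bridging the gap between clean synthetic proxy inputs (silhouettes and 2D joints rendered from SMPL) and the noisy outputs of keypoint detection and segmentation CNNs at test time, by corrupting the synthetic inputs during training. Below is a minimal illustrative sketch of this kind of corruption in NumPy; the function name, parameter values, and the specific noise and occlusion choices are assumptions for illustration, not the paper's actual augmentation pipeline.

```python
import numpy as np

def corrupt_proxy_inputs(silhouette, joints2d, rng,
                         joint_noise_std=4.0,
                         occlusion_prob=0.3,
                         max_occlusion_size=48):
    # Hypothetical sketch: jitter 2D joints and occlude the silhouette
    # so synthetic inputs resemble noisy detector/segmenter outputs.
    H, W = silhouette.shape
    sil = silhouette.copy()
    joints = joints2d.astype(np.float64)

    # Gaussian jitter on 2D joint locations (in pixels).
    joints += rng.normal(scale=joint_noise_std, size=joints.shape)

    # With some probability, cut a random rectangle out of the
    # silhouette to mimic occlusion or segmentation failure.
    if rng.random() < occlusion_prob:
        h = int(rng.integers(8, max_occlusion_size))
        w = int(rng.integers(8, max_occlusion_size))
        y = int(rng.integers(0, max(1, H - h)))
        x = int(rng.integers(0, max(1, W - w)))
        sil[y:y + h, x:x + w] = 0

    return sil, joints

# Example: a toy rectangular "body" silhouette and random joints.
rng = np.random.default_rng(0)
sil = np.zeros((256, 256), dtype=np.uint8)
sil[64:192, 96:160] = 1
joints = rng.uniform(0.0, 256.0, size=(17, 2))
sil_aug, joints_aug = corrupt_proxy_inputs(sil, joints, rng)
```

In the paper's setup, corrupted pairs like these would be fed to the shape and pose regression network during on-the-fly synthetic training, so that at test time the network tolerates imperfect CNN-predicted silhouettes and keypoints.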
