Paper Title

Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space Multi-Person Video Motion Capture in the Wild

Authors

Takuya Ohashi, Yosuke Ikegami, Yoshihiko Nakamura

Abstract

Although many studies have investigated markerless motion capture, the technology has not yet been applied to real sports or concerts. In this paper, we propose a markerless motion capture method with spatiotemporal accuracy and smoothness from multiple cameras in wide-space, multi-person environments. The proposed method predicts each person's 3D pose and determines a sufficiently small bounding box in each camera image. This prediction, combined with spatiotemporal filtering based on a human skeletal model, enables highly accurate 3D reconstruction of each person. The accurate 3D reconstruction is then used to predict the bounding box in each camera image for the next frame. This feedback from 3D motion to 2D pose produces a synergetic effect on the overall performance of video motion capture. We evaluated the proposed method on various datasets and a real sports field. The experimental results show a mean per joint position error (MPJPE) of 31.5 mm and a percentage of correct parts (PCP) of 99.5% for five people moving dynamically while satisfying the range of motion (RoM). Video demonstrations, datasets, and additional materials are posted on our project page.
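
To make the abstract's 3D-to-2D feedback loop concrete, the sketch below illustrates one plausible form of the step in which reconstructed 3D motion predicts a small per-camera bounding box for the next frame. It is not the authors' implementation: the constant-velocity joint prediction, the 3x4 pinhole camera matrix `P`, and the fixed pixel `margin` are all illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of the 3D-to-2D feedback step described
# in the abstract: predicted 3D joints are projected into each camera to obtain a
# small bounding box for the next frame's 2D pose estimation.
import numpy as np


def predict_next_joints(joints_t: np.ndarray, joints_t_prev: np.ndarray) -> np.ndarray:
    """Constant-velocity prediction of 3D joints (J, 3) for the next frame (an assumption)."""
    return joints_t + (joints_t - joints_t_prev)


def project(points_3d: np.ndarray, P: np.ndarray) -> np.ndarray:
    """Project (J, 3) world points with a 3x4 camera matrix P to (J, 2) pixel coordinates."""
    homo = np.hstack([points_3d, np.ones((points_3d.shape[0], 1))])  # (J, 4) homogeneous
    uvw = homo @ P.T                                                 # (J, 3)
    return uvw[:, :2] / uvw[:, 2:3]


def predict_bbox(joints_3d_pred: np.ndarray, P: np.ndarray, margin: float = 30.0):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the projected joints plus a margin."""
    uv = project(joints_3d_pred, P)
    x_min, y_min = uv.min(axis=0) - margin
    x_max, y_max = uv.max(axis=0) + margin
    return x_min, y_min, x_max, y_max


if __name__ == "__main__":
    # Toy example: 17 joints, two consecutive frames, one camera.
    rng = np.random.default_rng(0)
    joints_prev = rng.normal(size=(17, 3))
    joints_curr = joints_prev + 0.02  # small motion between frames
    P = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])  # simple 3x4 camera matrix
    pred = predict_next_joints(joints_curr, joints_prev)
    print(predict_bbox(pred, P))
```

In this sketch the box is simply the axis-aligned extent of the projected joints plus a fixed margin; the abstract does not specify how the actual method sizes the bounding box or predicts the next-frame pose.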
