Paper Title
4D Human Body Capture from Egocentric Video via 3D Scene Grounding
Paper Authors
Paper Abstract
We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos. The unique viewpoint and rapid embodied camera motion of egocentric video pose additional technical barriers for human body capture. To address these challenges, we propose a simple yet effective optimization-based approach that leverages 2D observations of the entire video sequence and human-scene interaction constraints to estimate second-person human poses, shapes, and global motion that are grounded in the 3D environment captured from the egocentric view. We conduct detailed ablation studies to validate our design choices. Moreover, we compare our method with previous state-of-the-art methods for human motion capture from monocular video, and show that our method estimates more accurate human body poses and shapes under the challenging egocentric setting. In addition, we demonstrate that our approach produces more realistic human-scene interactions.
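
To make the optimization-based formulation concrete, below is a minimal, self-contained Python sketch of the kind of objective the abstract describes: a data term fitting 2D observations over the entire sequence, a temporal smoothness term on global motion, and a human-scene interaction term. This is not the authors' implementation; the pinhole intrinsics K, the floor-plane stand-in for the 3D scene (FLOOR_Y), the loss weights, and the synthetic data are all illustrative assumptions.

# Illustrative sketch only (not the paper's code): jointly fit per-frame 3D
# joint positions to noisy 2D keypoints across a whole sequence, with
# temporal smoothness and a scene-penetration penalty.
import numpy as np
from scipy.optimize import minimize

T, J = 10, 15                    # frames, body joints (hypothetical sizes)
FLOOR_Y = 1.5                    # assumed floor plane (camera frame, y down)
K = np.array([[500.0,   0.0, 320.0],   # toy pinhole intrinsics
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(X):
    """Pinhole projection of (..., 3) camera-frame points to (..., 2) pixels."""
    uvw = X @ K.T
    return uvw[..., :2] / uvw[..., 2:3]

# Synthetic "ground truth" motion and noisy 2D observations of it.
rng = np.random.default_rng(0)
gt = np.cumsum(rng.normal(0.0, 0.01, (T, J, 3)), axis=0)
gt += np.array([0.0, 0.5, 3.0])              # person ~3 m in front of camera
obs_2d = project(gt) + rng.normal(0.0, 1.0, (T, J, 2))

def energy(x, w_data=1.0, w_smooth=100.0, w_scene=1000.0):
    X = x.reshape(T, J, 3)
    # (1) 2D reprojection term over the entire video sequence
    e_data = np.sum((project(X) - obs_2d) ** 2)
    # (2) temporal smoothness of the global motion
    e_smooth = np.sum((X[1:] - X[:-1]) ** 2)
    # (3) human-scene term: joints may touch but not pass below the floor
    pen = np.maximum(X[..., 1] - FLOOR_Y, 0.0)
    e_scene = np.sum(pen ** 2)
    return w_data * e_data + w_smooth * e_smooth + w_scene * e_scene

x0 = (gt + rng.normal(0.0, 0.05, gt.shape)).ravel()   # perturbed initialization
res = minimize(energy, x0, method="L-BFGS-B", options={"maxiter": 50})
X_hat = res.x.reshape(T, J, 3)
print("mean 3D joint error (m):", np.mean(np.linalg.norm(X_hat - gt, axis=-1)))

In the paper's actual setting, the free variables would presumably be body pose and shape parameters plus a global trajectory rather than raw joint positions, and the scene term would use the 3D environment reconstructed from the egocentric view rather than a single plane; the sketch only mirrors the structure of such an objective.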