Paper Title

Learning Motion-Dependent Appearance for High-Fidelity Rendering of Dynamic Humans from a Single Camera

Authors

Jae Shin Yoon, Duygu Ceylan, Tuanfeng Y. Wang, Jingwan Lu, Jimei Yang, Zhixin Shu, Hyun Soo Park

Abstract

The appearance of dressed humans undergoes a complex geometric transformation induced not only by the static pose but also by its dynamics, i.e., there exist a number of cloth geometric configurations for a given pose depending on the way it has moved. Such appearance modeling conditioned on motion has been largely neglected in existing human rendering methods, resulting in the rendering of physically implausible motion. A key challenge of learning the dynamics of the appearance lies in the requirement of a prohibitively large amount of observations. In this paper, we present a compact motion representation by enforcing equivariance -- a representation is expected to be transformed in the way that the pose is transformed. We model an equivariant encoder that can generate a generalizable representation from the spatial and temporal derivatives of the 3D body surface. This learned representation is decoded by a compositional multi-task decoder that renders high-fidelity time-varying appearance. Our experiments show that our method can generate a temporally coherent video of dynamic humans for unseen body poses and novel views given a single-view video.
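To make the abstract's two key ideas concrete, the sketch below approximates the temporal derivative of the 3D body surface with a finite difference over vertex positions and checks the equivariance property with a deliberately simple encoder (a uniform scaling, which trivially commutes with rotation). This is only a conceptual illustration under those assumptions: the names `finite_difference` and `encode` are hypothetical, and the paper's actual encoder is a learned network trained to satisfy the property approximately.

```python
import numpy as np

# Toy illustration (not the authors' code) of two ideas in the abstract:
# (1) motion features from spatial/temporal derivatives of the body surface,
#     approximated here by a finite-difference per-vertex velocity, and
# (2) equivariance: rotating the input body and then encoding should equal
#     encoding first and then rotating the resulting features.

def finite_difference(verts_t, verts_prev, dt=1.0 / 30.0):
    """Backward finite difference as a stand-in for the temporal
    derivative of the 3D body-surface vertices (per-vertex velocity)."""
    return (verts_t - verts_prev) / dt

def encode(verts, velocity, scale=0.5):
    """Toy encoder: a uniform scaling of position plus velocity.
    Scaling commutes with rotation, so the equivariance check below holds
    exactly; the paper instead trains a network to satisfy this property."""
    return scale * (verts + velocity)

# Two consecutive frames of a body surface with N vertices (random stand-in).
rng = np.random.default_rng(0)
N = 6890
verts_prev = rng.standard_normal((N, 3))
verts_t = verts_prev + 0.01 * rng.standard_normal((N, 3))
vel = finite_difference(verts_t, verts_prev)

# A rotation about the z-axis, standing in for a pose/orientation change.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

# Equivariance check: encode(R x, R v) == R encode(x, v).
lhs = encode(verts_t @ R.T, vel @ R.T)
rhs = encode(verts_t, vel) @ R.T
print("equivariance holds:", np.allclose(lhs, rhs))
```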
