Paper Title
Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape
Paper Authors
Paper Abstract
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior. However, previously reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry, which is attributed to insufficient ground-truth 3D shapes, unreliable training strategies, and the limited representation power of 3DMM. To alleviate this issue, this paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person. Specifically, given a 2D image as the input, we virtually render the image in several calibrated views to normalize pose variations while preserving the original image geometry. A many-to-one hourglass network serves as the encoder-decoder to fuse multi-view features and generate vertex displacements as the fine-grained geometry. In addition, the neural network is trained by directly optimizing the visual effect, where two 3D shapes are compared by measuring the similarity between the multi-view images rendered from the shapes. Finally, we propose to generate the ground-truth 3D shapes by registering RGB-D images followed by pose and shape augmentation, providing sufficient data for network training. Experiments on several challenging protocols demonstrate the superior reconstruction accuracy of our proposal on face shape.
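To make the abstract's training objective concrete, below is a minimal PyTorch sketch of comparing two 3D face shapes by rendering them from several calibrated views and measuring image similarity, as the abstract describes. The function name `multiview_visual_loss`, the `render_view` callable, and the plain L1 image difference are illustrative assumptions, not the paper's actual implementation.

```python
import torch


def multiview_visual_loss(pred_vertices, gt_vertices, faces, cameras, render_view):
    """Compare two 3D face shapes via their multi-view renderings.

    pred_vertices, gt_vertices: (N, 3) float tensors of vertex positions
        (shared mesh topology).
    faces: (F, 3) integer tensor of triangle indices.
    cameras: iterable of camera parameters, one per calibrated view.
    render_view: a differentiable renderer supplied by the caller,
        callable as (vertices, faces, camera) -> (H, W, C) image tensor.
        Its interface is assumed here for illustration.
    """
    losses = []
    for camera in cameras:
        pred_img = render_view(pred_vertices, faces, camera)
        gt_img = render_view(gt_vertices, faces, camera)
        # L1 image difference as a simple stand-in for the paper's
        # visual-effect similarity measure between rendered views.
        losses.append(torch.mean(torch.abs(pred_img - gt_img)))
    # Average over all calibrated views so every viewpoint contributes equally.
    return torch.stack(losses).mean()
```

Because the loss is computed on rendered images rather than directly on vertex coordinates, gradients flow through the renderer and penalize exactly the kind of fine-grained geometric detail that a per-vertex distance can under-weight.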