Multi-view Human Body Mesh Translator

Paper Title

Multi-view Human Body Mesh Translator

Authors

Xiangjian Jiang, Xuecheng Nie, Zitian Wang, Luoqi Liu, Si Liu

Abstract

Existing methods for human mesh recovery mainly focus on single-view frameworks, but they often fail to produce accurate results because the single-view setup is ill-posed. Considering the maturity of multi-view motion capture systems, in this paper we propose to address this ill-posed problem by leveraging multiple images from different views, thus significantly enhancing the quality of the recovered meshes. In particular, we present a novel Multi-view human body Mesh Translator (MMT) model that estimates the human body mesh with the help of a vision transformer. Specifically, MMT takes multi-view images as input and translates them to the target meshes in a single forward pass. MMT fuses features of different views in both the encoding and decoding phases, leading to representations embedded with global information. Additionally, to ensure that the tokens focus on the human pose and shape, MMT conducts cross-view alignment at the feature level by projecting 3D keypoint positions to each view and enforcing their consistency through geometric constraints. Comprehensive experiments demonstrate that MMT outperforms existing single-view and multi-view models by a large margin on the human mesh recovery task, notably achieving a 28.8% improvement in MPVE (mean per-vertex error) over the current state-of-the-art method on the challenging HUMBI dataset. Qualitative evaluation also verifies the effectiveness of MMT in reconstructing high-quality human meshes. Code will be made available upon acceptance.
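The abstract mentions projecting 3D keypoints to each view and reports results in MPVE. The sketch below illustrates both ideas in plain NumPy. It is not the authors' implementation: the pinhole camera model, the function names (`project_points`, `reprojection_consistency`, `mpve`), and the use of a pixel-space distance are all assumptions made for illustration, whereas MMT performs its alignment at the feature level.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project (N, 3) world points to (N, 2) pixels with an assumed pinhole camera."""
    cam = points_3d @ R.T + t          # world -> camera coordinates
    uvw = cam @ K.T                    # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

def reprojection_consistency(points_3d, cameras, keypoints_2d):
    """Mean pixel distance between projected 3D keypoints and each view's 2D keypoints.

    cameras: list of (K, R, t) tuples, one per view (hypothetical calibration).
    keypoints_2d: list of (N, 2) arrays, one per view.
    """
    errors = []
    for (K, R, t), kp2d in zip(cameras, keypoints_2d):
        proj = project_points(points_3d, K, R, t)
        errors.append(np.linalg.norm(proj - kp2d, axis=1).mean())
    return float(np.mean(errors))

def mpve(pred_vertices, gt_vertices):
    """Mean per-vertex error: average Euclidean distance between matched vertices."""
    return float(np.linalg.norm(pred_vertices - gt_vertices, axis=1).mean())
```

A consistency term of this kind is zero when the projected keypoints agree with the 2D evidence in every view, which is one way to read the geometric constraint described above. MPVE is lower when the predicted mesh is closer to the ground truth.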

Paper Title