Paper Title
APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals
Paper Authors
Paper Abstract
Audio-guided face reenactment aims to generate photorealistic faces from audio information while preserving the facial movements of a real speaking person. However, existing methods either fail to generate vivid face images or can only reenact low-resolution faces, which limits their practical value. To solve these problems, we propose a novel deep neural network named APB2Face, which consists of GeometryPredictor and FaceReenactor modules. GeometryPredictor uses auxiliary head pose and blink state signals together with the audio to predict latent landmark geometry information, while FaceReenactor takes the face landmark image as input and reenacts the photorealistic face. A new dataset, AnnVI, collected from YouTube is presented to support the approach, and experimental results demonstrate the superiority of our method over state-of-the-art approaches in both authenticity and controllability.
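To make the two-stage design concrete, the sketch below is a minimal PyTorch-style rendition of the pipeline the abstract describes: audio plus auxiliary pose/blink signals drive a landmark predictor, whose output is rasterized and fed to an image generator. The module names follow the abstract (GeometryPredictor, FaceReenactor), but all layer sizes, the 68-landmark count, and the `landmarks_to_image` helper are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of the APB2Face two-stage pipeline described in the abstract.
# Layer sizes, the 68-landmark assumption, and the rasterization helper are
# illustrative guesses, not the paper's actual implementation.
import torch
import torch.nn as nn

class GeometryPredictor(nn.Module):
    """Maps audio features plus auxiliary pose/blink signals to face landmarks."""
    def __init__(self, audio_dim=256, pose_dim=3, blink_dim=1, n_landmarks=68):
        super().__init__()
        self.n_landmarks = n_landmarks
        self.net = nn.Sequential(
            nn.Linear(audio_dim + pose_dim + blink_dim, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, n_landmarks * 2),  # (x, y) per landmark
        )

    def forward(self, audio_feat, head_pose, blink_state):
        x = torch.cat([audio_feat, head_pose, blink_state], dim=-1)
        return self.net(x).view(-1, self.n_landmarks, 2)

class FaceReenactor(nn.Module):
    """Image-to-image generator: landmark image -> photorealistic face."""
    def __init__(self):
        super().__init__()
        # Stand-in encoder-decoder; the real model would be a full GAN generator.
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, landmark_img):
        return self.net(landmark_img)

def landmarks_to_image(landmarks, size=256):
    """Rasterize (x, y) landmarks in [0, 1] onto a single-channel canvas."""
    b = landmarks.shape[0]
    img = torch.zeros(b, 1, size, size)
    idx = (landmarks.clamp(0, 1) * (size - 1)).long()
    for i in range(b):
        img[i, 0, idx[i, :, 1], idx[i, :, 0]] = 1.0
    return img

# Toy forward pass: audio + pose + blink -> landmarks -> reenacted face.
audio = torch.randn(2, 256)   # e.g., features from an upstream audio encoder
pose = torch.randn(2, 3)      # head pose: yaw, pitch, roll
blink = torch.rand(2, 1)      # blink state in [0, 1]
lmk = GeometryPredictor()(audio, pose, blink).sigmoid()
face = FaceReenactor()(landmarks_to_image(lmk))
print(face.shape)             # torch.Size([2, 3, 256, 256])
```

One point the sketch makes explicit is why the intermediate landmark stage enables controllability: head pose and blink are inputs to GeometryPredictor rather than byproducts of the audio, so they can be set independently of the speech signal at inference time.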