Paper Title
3D-FM GAN: Towards 3D-Controllable Face Manipulation
Paper Authors
Paper Abstract
3D-controllable portrait synthesis has advanced significantly, thanks to breakthroughs in generative adversarial networks (GANs). However, manipulating existing face images with precise 3D control remains challenging. While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality. To fill this gap, we propose 3D-FM GAN, a novel conditional GAN framework designed specifically for 3D-controllable face manipulation that requires no tuning after the end-to-end learning phase. By carefully encoding both the input face image and a physically-based rendering of the 3D edits into a StyleGAN's latent spaces, our image generator provides high-quality, identity-preserving, 3D-controllable face manipulation. To learn such a novel framework effectively, we develop two essential training strategies and a novel multiplicative co-modulation architecture that improves significantly upon naive schemes. With extensive evaluations, we show that our method outperforms prior art on various tasks, with better editability, stronger identity preservation, and higher photo-realism. In addition, we demonstrate the better generalizability of our design on large pose editing and out-of-domain images.
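The abstract mentions a "multiplicative co-modulation" architecture that combines codes from the photo and the 3D rendering before modulating the generator. The following is a minimal toy sketch of that idea only, not the paper's implementation: all dimensions, encoder weights, and feature vectors here are hypothetical placeholders, and the encoders are stand-in linear projections. It shows the core operation of combining an identity code and an edit code elementwise so the edit can gate identity channels rather than merely shift them.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    # Hypothetical stand-in encoder: a single linear projection
    # squashed into (-1, 1); real encoders would be deep networks.
    return np.tanh(W @ x)

# Hypothetical dimensions: input features -> 512-d style codes.
D_IN, D_W = 256, 512
W_img = rng.standard_normal((D_W, D_IN)) * 0.05  # photo (identity) encoder
W_rnd = rng.standard_normal((D_W, D_IN)) * 0.05  # 3D-render (edit) encoder

photo_feat = rng.standard_normal(D_IN)   # features of the input face image
render_feat = rng.standard_normal(D_IN)  # features of the edited 3D rendering

w_id = encode(photo_feat, W_img)     # code carrying identity information
w_edit = encode(render_feat, W_rnd)  # code carrying pose/expression/lighting edits

# Multiplicative co-modulation: combine the two codes elementwise, so the
# edit code scales (gates) each channel of the identity code instead of
# being added to it. The combined code would then modulate each synthesis
# layer of a StyleGAN-style generator.
w_co = w_id * w_edit
print(w_co.shape)  # (512,)
```

An additive scheme (`w_id + w_edit`) would be the naive alternative; the multiplicative form lets the edit signal selectively suppress or amplify identity channels, which the abstract credits with a significant improvement.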