Paper Title

Video2StyleGAN: Disentangling Local and Global Variations in a Video

Authors

Rameen Abdal, Peihao Zhu, Niloy J. Mitra, Peter Wonka

Abstract

Image editing using a pretrained StyleGAN generator has emerged as a powerful paradigm for facial editing, providing disentangled controls over age, expression, illumination, etc. However, the approach cannot be directly adopted for video manipulations. We hypothesize that the main missing ingredient is the lack of fine-grained and disentangled control over face location, face pose, and local facial expressions. In this work, we demonstrate that such a fine-grained control is indeed achievable using pretrained StyleGAN by working across multiple (latent) spaces (namely, the positional space, the W+ space, and the S space) and combining the optimization results across the multiple spaces. Building on this enabling component, we introduce Video2StyleGAN that takes a target image and driving video(s) to reenact the local and global locations and expressions from the driving video in the identity of the target image. We evaluate the effectiveness of our method over multiple challenging scenarios and demonstrate clear improvements over alternative approaches.
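The abstract's central mechanism is optimizing edits in several latent spaces separately and combining the results. The toy sketch below illustrates that composition conceptually only; all shapes, names, and the `optimize_offset` helper are illustrative assumptions, not the authors' implementation (real StyleGAN2 latents are, e.g., 18×512 for W+):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) latent-space shapes, not the real StyleGAN ones.
W_PLUS_SHAPE = (18, 512)  # W+ space: global pose/expression
S_SHAPE = (256,)          # S (style) space: local, fine-grained edits
POS_SHAPE = (4,)          # positional space: e.g. x, y, scale, rotation

def optimize_offset(shape, step=0.1):
    """Stand-in for one per-space optimization pass: returns a small
    offset nudging the generated frame toward the driving frame."""
    return step * rng.standard_normal(shape)

# Codes obtained by inverting the target image (random stand-ins here).
w_plus = rng.standard_normal(W_PLUS_SHAPE)
s = rng.standard_normal(S_SHAPE)
pos = np.zeros(POS_SHAPE)

# Per-space optimization results for one driving-video frame.
dw = optimize_offset(W_PLUS_SHAPE)
ds = optimize_offset(S_SHAPE)
dpos = optimize_offset(POS_SHAPE)

# Combine: each space contributes its own disentangled component,
# and the three edited codes together would drive the generator.
w_plus_edit = w_plus + dw
s_edit = s + ds
pos_edit = pos + dpos

print(w_plus_edit.shape, s_edit.shape, pos_edit.shape)
```

The point of the sketch is only the structure: each latent space is optimized independently for the control it disentangles, and the edits are composed per frame rather than solved jointly in a single space.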
