Paper Title

Towards Smooth Video Composition

Paper Authors

Qihang Zhang, Ceyuan Yang, Yujun Shen, Yinghao Xu, Bolei Zhou

Paper Abstract

Video generation requires synthesizing consistent and persistent frames with dynamic content over time. This work investigates modeling the temporal relations for composing video with arbitrary length, from a few frames to even infinite, using generative adversarial networks (GANs). First, towards composing adjacent frames, we show that the alias-free operation for single image generation, together with adequately pre-learned knowledge, brings a smooth frame transition without compromising the per-frame quality. Second, by incorporating the temporal shift module (TSM), originally designed for video understanding, into the discriminator, we manage to advance the generator in synthesizing more consistent dynamics. Third, we develop a novel B-Spline based motion representation to ensure temporal smoothness and achieve infinite-length video generation, going beyond the frame number used in training. A low-rank temporal modulation is also proposed to alleviate repeating contents for long video generation. We evaluate our approach on various datasets and show substantial improvements over video generation baselines. Code and models will be publicly available at https://genforce.github.io/StyleSV.
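The temporal shift module mentioned in the abstract mixes information across neighboring frames by sliding a slice of channels along the time axis, which lets a 2D discriminator see temporal context at near-zero extra cost. Below is a minimal sketch of that operation, assuming a PyTorch (batch, time, channels, height, width) feature layout; the function name and the shift_div ratio are illustrative defaults, not taken from the authors' released code.

```python
import torch

def temporal_shift(x: torch.Tensor, shift_div: int = 8) -> torch.Tensor:
    """Shift a fraction of channels along the time axis (TSM-style).

    x: video features of shape (batch, time, channels, height, width).
    1/shift_div of the channels move one step backward in time, another
    1/shift_div move one step forward, and the rest stay in place.
    """
    b, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                   # shift backward in time
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]   # shift forward in time
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # untouched channels
    return out

# Usage: apply the shift before a per-frame 2D conv inside a discriminator block.
feats = torch.randn(2, 16, 64, 32, 32)  # (batch, frames, channels, H, W)
mixed = temporal_shift(feats)           # frames now exchange channel slices
```

Likewise, the B-Spline based motion representation can be pictured as a smooth spline curve through a sequence of latent motion codes, so any continuous timestamp maps to a well-defined code. The sketch below uses SciPy's BSpline with a clamped knot vector; the control-code count and dimension are hypothetical stand-ins for the paper's motion latents.

```python
import numpy as np
from scipy.interpolate import BSpline

rng = np.random.default_rng(0)
degree = 3
control = rng.standard_normal((8, 512))  # 8 hypothetical motion codes, dim 512

# Clamped knot vector: the curve starts at the first code and ends at the last.
n = len(control)
knots = np.concatenate([
    np.zeros(degree),
    np.linspace(0.0, 1.0, n - degree + 1),
    np.ones(degree),
])
spline = BSpline(knots, control, degree)

# Any time in [0, 1] maps to a smoothly varying motion code, so frames can be
# sampled at arbitrary, even unseen, timestamps.
times = np.linspace(0.0, 1.0, 64)
motion_codes = spline(times)  # shape (64, 512)
```

Because the curve is continuous in time, frames can be sampled beyond the frame count seen during training, which is the property the abstract highlights for infinite-length generation.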
