StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

Paper title

StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

Authors

Axel Sauer, Katja Schwarz, Andreas Geiger

Abstract

Computer graphics has experienced a recent surge of data-centric approaches for photorealistic and controllable content creation. StyleGAN in particular sets new standards for generative modeling regarding image quality and controllability. However, StyleGAN's performance severely degrades on large unstructured datasets such as ImageNet. StyleGAN was designed for controllability; hence, prior works suspect its restrictive design to be unsuitable for diverse datasets. In contrast, we find the main limiting factor to be the current training strategy. Following the recently introduced Projected GAN paradigm, we leverage powerful neural network priors and a progressive growing strategy to successfully train the latest StyleGAN3 generator on ImageNet. Our final model, StyleGAN-XL, sets a new state-of-the-art on large-scale image synthesis and is the first to generate images at a resolution of $1024^2$ at such a dataset scale. We demonstrate that this model can invert and edit images beyond the narrow domain of portraits or specific object classes.
