Paper Title

Rayleigh EigenDirections (REDs): GAN latent space traversals for multidimensional features

Authors

Guha Balakrishnan, Raghudeep Gadde, Aleix Martinez, Pietro Perona

Abstract

We present a method for finding paths in a deep generative model's latent space that can maximally vary one set of image features while holding others constant. Crucially, unlike past traversal approaches, ours can manipulate multidimensional features of an image such as facial identity and pixels within a specified region. Our method is principled and conceptually simple: optimal traversal directions are chosen by maximizing differential changes to one feature set such that changes to another set are negligible. We show that this problem is nearly equivalent to one of Rayleigh quotient maximization, and provide a closed-form solution to it based on solving a generalized eigenvalue equation. We use repeated computations of the corresponding optimal directions, which we call Rayleigh EigenDirections (REDs), to generate appropriately curved paths in latent space. We empirically evaluate our method using StyleGAN2 on two image domains: faces and living rooms. We show that our method is capable of controlling various multidimensional features out of the scope of previous latent space traversal methods: face identity, spatial frequency bands, pixels within a region, and the appearance and position of an object. Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
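
The Rayleigh quotient step mentioned in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes that the local effect of a latent step d on the features to vary and on the features to hold fixed is given by two Jacobian matrices, here called `J_vary` and `J_fix` (hypothetical names), and that a small regularizer `eps` (also an assumption) keeps the denominator matrix positive definite. Under those assumptions, the direction maximizing the ratio of squared feature changes is the top generalized eigenvector of the pair (A, B), where A = J_vary^T J_vary and B = J_fix^T J_fix + eps I.

```python
import numpy as np

def rayleigh_direction(J_vary, J_fix, eps=1e-6):
    """Sketch: unit direction d maximizing
    ||J_vary d||^2 / (||J_fix d||^2 + eps * ||d||^2).

    J_vary, J_fix: (n_features, latent_dim) Jacobians (assumed given).
    eps: assumed regularizer that makes B positive definite.
    """
    A = J_vary.T @ J_vary
    B = J_fix.T @ J_fix + eps * np.eye(J_fix.shape[1])

    # Reduce the generalized problem A d = lam B d to a standard
    # symmetric one with the Cholesky factor B = L L^T:
    # C v = lam v, where C = L^{-1} A L^{-T} and d = L^{-T} v.
    L = np.linalg.cholesky(B)
    Y = np.linalg.solve(L, A)        # L^{-1} A
    C = np.linalg.solve(L, Y.T)      # L^{-1} A L^{-T} (A is symmetric)

    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    v = eigvecs[:, -1]                     # eigenvector of the largest one
    d = np.linalg.solve(L.T, v)            # map back to latent coordinates
    return d / np.linalg.norm(d), eigvals[-1]

# Toy 4-D latent space: the feature to vary responds to z0 and z1,
# while the features to hold fixed respond to z1 and z2.
J_vary = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0, 0.0]])
J_fix = np.array([[0.0, 3.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])

d, lam = rayleigh_direction(J_vary, J_fix)
```

In this toy case the selected direction lies along the first latent axis (up to sign), the only axis that changes the varied feature without touching the fixed ones. A curved path of the kind the abstract describes would come from recomputing the Jacobians and the direction after each small step, which this sketch does not attempt.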
