Controllable Image Synthesis via SegVAE

Paper Title

Controllable Image Synthesis via SegVAE

Authors

Yen-Chi Cheng, Hsin-Ying Lee, Min Sun, Ming-Hsuan Yang

Abstract

Flexible user controls are desirable for content creation and image editing. A semantic map is a commonly used intermediate representation for conditional image generation. Compared to operating on raw RGB pixels, a semantic map enables simpler user modification. In this work, we specifically target generating semantic maps given a label-set consisting of desired categories. The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder. Quantitative and qualitative experiments demonstrate that the proposed model can generate realistic and diverse semantic maps. We also apply an off-the-shelf image-to-image translation model to generate realistic RGB images to better understand the quality of the synthesized semantic maps. Furthermore, we showcase several real-world image-editing applications, including object removal, object insertion, and object replacement.
