Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Paper Title

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Authors

Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang

Abstract

Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks discard the discriminator once training is complete. This paper proposes a novel role for the discriminator: reusing it to encode images of the target domain. The proposed architecture, termed NICE-GAN, has two advantages over previous approaches. First, it is more compact, since no independent encoding component is required. Second, this plug-in encoder is trained directly by the adversarial loss, which makes it more informative and more effectively trained when a multi-scale discriminator is applied. The main issue in NICE-GAN is the coupling of translation with discrimination along the encoder, which could cause training inconsistency when playing the min-max game of the GAN. To tackle this issue, we develop a decoupled training strategy in which the encoder is trained only when maximizing the adversarial loss and is kept frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and human preference. Comprehensive ablation studies are also carried out to isolate the validity of each proposed component. Our code is available at https://github.com/alpc91/NICE-GAN-pytorch.
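The decoupled training strategy described in the abstract can be illustrated with a small toy, given here as a sketch under strong assumptions rather than the authors' implementation. Every "network" is reduced to a single scalar weight, gradients are taken by finite differences, a least-squares adversarial loss is assumed, and any other loss terms of the full method are left out. All names (`d_phase`, `g_phase`, `e_x`, `c_x`, `g_xy`, and so on) are hypothetical. What the sketch shows is only the scheduling: the encoder inside each discriminator is updated in the discriminator phase and is left untouched in the generator phase, even though the generator's loss flows through it.

```python
# Toy scalar sketch of decoupled training. Not the authors' implementation.
# Discriminator for a domain: D(v) = C(E(v)). Translation x->y: G_xy(E_x(x)).

DISCRIMINATOR_KEYS = ("e_x", "c_x", "e_y", "c_y")  # encoders + classifiers
GENERATOR_KEYS = ("g_xy", "g_yx")


def translate(p, src, dst, v):
    """Translate v from src to dst, reusing the src discriminator's encoder."""
    return p[f"g_{src}{dst}"] * (p[f"e_{src}"] * v)


def score(p, dom, v):
    """Discriminator output for domain dom: classifier applied to the encoding."""
    return p[f"c_{dom}"] * (p[f"e_{dom}"] * v)


def step(p, loss_fn, trainable, lr=0.05, eps=1e-6):
    """Gradient-descent step that changes only the keys listed in trainable."""
    new_p = dict(p)
    for k in trainable:
        up = {**p, k: p[k] + eps}
        down = {**p, k: p[k] - eps}
        grad = (loss_fn(up) - loss_fn(down)) / (2 * eps)  # central difference
        new_p[k] = p[k] - lr * grad
    return new_p


def d_phase(p, x, y):
    """Discriminator phase: encoders and classifiers are trained.

    Minimizing this least-squares loss plays the role of maximizing the
    adversarial objective. Fakes are computed once and held constant.
    """
    fake_y = translate(p, "x", "y", x)
    fake_x = translate(p, "y", "x", y)

    def d_loss(q):
        return ((score(q, "x", x) - 1.0) ** 2 + score(q, "x", fake_x) ** 2
                + (score(q, "y", y) - 1.0) ** 2 + score(q, "y", fake_y) ** 2)

    return step(p, d_loss, DISCRIMINATOR_KEYS)


def g_phase(p, x, y):
    """Generator phase: only generators are trained; encoders stay frozen."""

    def g_loss(q):
        return ((score(q, "y", translate(q, "x", "y", x)) - 1.0) ** 2
                + (score(q, "x", translate(q, "y", "x", y)) - 1.0) ** 2)

    return step(p, g_loss, GENERATOR_KEYS)


# Usage: alternate the two phases on one toy sample from each domain.
params = {k: 0.5 for k in DISCRIMINATOR_KEYS + GENERATOR_KEYS}
for _ in range(10):
    params = d_phase(params, x=1.0, y=2.0)
    params = g_phase(params, x=1.0, y=2.0)
```

In a deep-learning framework the same scheduling would typically be expressed by giving the encoder's parameters only to the discriminator's optimizer; the key-list approach above is just the simplest stand-in for that design choice.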
