Paper Title
Self-Guided Diffusion Models
Paper Authors
Paper Abstract
Diffusion models have demonstrated remarkable progress in image generation quality, especially when guidance is used to control the generative process. However, guidance requires a large amount of image-annotation pairs for training and is thus dependent on their availability, correctness and unbiasedness. In this paper, we eliminate the need for such annotation by instead leveraging the flexibility of self-supervision signals to design a framework for self-guided diffusion models. By leveraging a feature extraction function and a self-annotation function, our method provides guidance signals at various image granularities: from the level of holistic images to object boxes and even segmentation masks. Our experiments on single-label and multi-label image datasets demonstrate that self-labeled guidance always outperforms diffusion models without guidance and may even surpass guidance based on ground-truth labels, especially on unbalanced data. When equipped with self-supervised box or mask proposals, our method further generates visually diverse yet semantically consistent images, without the need for any class, box, or segment label annotation. Self-guided diffusion is simple, flexible and expected to profit from deployment at scale. Source code will be available at: https://taohu.me/sgdm/
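The abstract describes replacing ground-truth labels with self-annotations derived from self-supervised features, which then condition the diffusion model through guidance at sampling time. Below is a minimal, illustrative sketch of that recipe, not the paper's implementation: the `FeatureExtractor` and `Denoiser` modules, the cluster count, and the guidance scale are all placeholder assumptions, and the self-annotation step is shown here as simple k-means clustering combined with classifier-free-style guidance over the resulting pseudo-labels.

```python
# Hypothetical sketch: self-labeled guidance via clustered self-supervised features.
# All module definitions and hyperparameters below are assumptions for illustration.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

NUM_CLUSTERS = 10      # assumed number of self-annotation clusters
GUIDANCE_SCALE = 3.0   # assumed guidance strength

class FeatureExtractor(nn.Module):
    """Stand-in for a frozen self-supervised encoder (e.g. a DINO-like backbone)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, dim))
    def forward(self, x):
        return self.net(x)

class Denoiser(nn.Module):
    """Stand-in for a conditional noise-prediction network eps(x_t, t, label)."""
    def __init__(self, num_classes):
        super().__init__()
        self.embed = nn.Embedding(num_classes + 1, 32)  # extra index = "null" label
        self.net = nn.Linear(3 * 32 * 32 + 32 + 1, 3 * 32 * 32)
    def forward(self, x_t, t, label):
        h = torch.cat([x_t.flatten(1), self.embed(label), t.float().unsqueeze(1)], dim=1)
        return self.net(h).view_as(x_t)

# 1) Self-annotation: cluster self-supervised features into pseudo-labels,
#    so no human class annotations are needed.
images = torch.randn(256, 3, 32, 32)  # placeholder dataset
with torch.no_grad():
    feats = FeatureExtractor()(images).numpy()
self_labels = torch.from_numpy(
    KMeans(n_clusters=NUM_CLUSTERS, n_init=10).fit_predict(feats)
).long()

# 2) Sampling-time guidance, classifier-free style, conditioned on the pseudo-labels.
denoiser = Denoiser(NUM_CLUSTERS)
x_t = torch.randn(4, 3, 32, 32)
t = torch.full((4,), 500)
k = self_labels[:4]                          # pseudo-labels to condition on
null = torch.full((4,), NUM_CLUSTERS)        # "unconditional" null label
eps_cond = denoiser(x_t, t, k)
eps_uncond = denoiser(x_t, t, null)
eps_guided = eps_uncond + GUIDANCE_SCALE * (eps_cond - eps_uncond)
```

The same pattern extends in spirit to the finer granularities mentioned in the abstract: box- or mask-level self-annotations would replace the image-level pseudo-labels as the conditioning signal.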