FRIH: Fine-grained Region-aware Image Harmonization

Paper Title

FRIH: Fine-grained Region-aware Image Harmonization

Authors

Jinlong Peng, Zekun Luo, Liang Liu, Boshen Zhang, Tao Wang, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin

Abstract

Image harmonization aims to make the foreground of a composite image look realistic against its background. Existing methods apply the same harmonization process to the whole foreground. However, the implanted foreground usually contains several different appearance patterns, so existing solutions ignore the differences between color blocks and lose specific details. We therefore propose a novel global-local two-stage framework for Fine-grained Region-aware Image Harmonization (FRIH), which is trained end-to-end. In the first stage, the whole input foreground mask is used to perform a global coarse-grained harmonization. In the second stage, we adaptively cluster the input foreground mask into several submasks according to the corresponding pixel RGB values in the composite image. Each submask is concatenated with the coarsely adjusted image and fed into a lightweight cascaded module, which refines the global harmonization result according to region-aware local features. Moreover, we design a fusion prediction module that fuses features from all the cascaded decoder layers to generate the final result, making comprehensive use of the different degrees of harmonization. Without bells and whistles, our FRIH algorithm achieves the best performance on the iHarmony4 dataset (PSNR of 38.19 dB) with a lightweight model. Our model has only 11.98 M parameters, far fewer than existing methods.
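The second-stage step of splitting the foreground mask into submasks by pixel RGB values can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so plain k-means is an assumption here, and the function name `cluster_submasks` and the parameters `k`, `iters`, and `seed` are hypothetical names chosen for illustration only.

```python
import numpy as np

def cluster_submasks(image, mask, k=3, iters=10, seed=0):
    """Split a foreground mask into k submasks by clustering RGB values.

    ASSUMPTION: plain k-means stands in for the paper's clustering step,
    which the abstract does not specify.

    image: (H, W, 3) array, the composite image.
    mask:  (H, W) bool array, True on foreground pixels (at least k of them).
    Returns a (k, H, W) bool array whose slices partition the foreground.
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(mask)
    pixels = image[ys, xs].astype(np.float64)  # (N, 3) foreground colors

    # Initialize the centers from k distinct foreground pixels.
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    labels = np.zeros(len(pixels), dtype=np.int64)

    for _ in range(max(1, iters)):
        # Assign every foreground pixel to its nearest center in RGB space.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its pixels.
        for j in range(k):
            members = labels == j
            if members.any():
                centers[j] = pixels[members].mean(axis=0)

    # Scatter the labels back into one boolean mask per cluster.
    submasks = np.zeros((k,) + mask.shape, dtype=bool)
    submasks[labels, ys, xs] = True
    return submasks
```

Under the same assumptions, each submask could then be stacked onto the coarse first-stage output as an extra channel, for example `np.concatenate([coarse, submasks[i][..., None]], axis=2)` for an `(H, W, 3)` array `coarse`, before being passed to a refinement module. Some clusters may end up empty when the foreground has fewer than `k` distinct colors.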
