Paper Title
Single Image Super-Resolution Using Lightweight Networks Based on Swin Transformer
Paper Authors
Paper Abstract
Image super-resolution reconstruction is an important task in image processing: it restores a high-resolution, high-quality image from a low-resolution one. In recent years, deep learning has been applied to image super-resolution reconstruction. As deep neural networks have developed, the quality of reconstructed images has improved greatly, but model complexity has also increased. In this paper, we propose two lightweight models based on Swin Transformer, named MSwinSR and UGSwinSR. The most important structure in MSwinSR is the Multi-size Swin Transformer Block (MSTB), which mainly contains four parallel multi-head self-attention (MSA) blocks. UGSwinSR combines U-Net and GAN with Swin Transformer. Both models reduce model complexity; MSwinSR reaches a higher objective quality, while UGSwinSR reaches a higher perceptual quality. The experimental results demonstrate that, compared with the state-of-the-art model SwinIR, MSwinSR increases PSNR by $\mathbf{0.07dB}$ while reducing the number of parameters by $\mathbf{30.68\%}$ and the computational cost by $\mathbf{9.936\%}$. UGSwinSR reduces the computational cost of the network by $\mathbf{90.92\%}$ compared with SwinIR.
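The abstract reports objective quality in terms of PSNR. As a minimal sketch of how that metric is conventionally computed, and not the authors' evaluation code, the snippet below uses the standard definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration; published super-resolution benchmarks often add steps not shown here, such as evaluating on the luminance channel only or cropping borders.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images.

    Both images are given as flat sequences of pixel values of equal length
    (a hypothetical representation chosen to keep the sketch dependency-free).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("images must contain the same number of pixels")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [10, 20, 30, 40]
    restored = [11, 21, 31, 41]  # every pixel is off by one, so MSE = 1
    print(f"PSNR: {psnr(ground_truth, restored):.2f} dB")
```

Under this definition, and for a fixed peak value, a gain of 0.07 dB corresponds to the mean squared error shrinking by a factor of $10^{-0.007} \approx 0.984$, that is, by roughly 1.6%.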