Paper Title

JPD-SE: High-Level Semantics for Joint Perception-Distortion Enhancement in Image Compression

Authors

Shiyu Duan, Huaijin Chen, Jinwei Gu

Abstract

While humans can effortlessly transform complex visual scenes into simple words, and the other way around, by leveraging their high-level understanding of the content, conventional and more recent learned image compression codecs do not seem to utilize the semantic meaning of visual content to its full potential. Moreover, they focus mostly on rate-distortion performance and tend to underperform in perceptual quality, especially in the low-bitrate regime, and they often disregard the performance of downstream computer vision algorithms, which are a fast-growing group of consumers of compressed images in addition to human viewers. In this paper, we (1) present a generic framework that enables any image codec to leverage high-level semantics and (2) study the joint optimization of perceptual quality and distortion. Our idea is that, given any codec, we use high-level semantics to augment the low-level visual features it extracts, producing what is essentially a new, semantic-aware codec. We propose a three-phase training scheme that teaches semantic-aware codecs to leverage semantics to jointly optimize rate-perception-distortion (R-PD) performance. As an additional benefit, semantic-aware codecs also boost the performance of downstream computer vision algorithms. To validate our claims, we perform extensive empirical evaluations and provide both quantitative and qualitative results.
