Paper Title
Task-Oriented Image Semantic Communication Based on Rate-Distortion Theory
Paper Authors
Paper Abstract
Task-oriented image semantic communication is a new communication paradigm that aims to transmit the semantics needed by artificial intelligence (AI) tasks while ignoring the reconstruction quality of the images. However, in some applications, such as autonomous driving, both the image reconstruction quality and the performance of the subsequent AI tasks must be considered simultaneously. To tackle this challenge, this paper proposes a task-oriented semantic communication scheme with semantic reconstruction (TOSC-SR). Its main goal is to simultaneously minimize the pixel-level and task-relevant semantic-level distortion during communication under a given rate, which is formulated as a new rate-distortion optimization problem. To measure the loss at the semantic level, a new form of semantic distortion is proposed, measured by the mutual information between the semantic-reconstructed images and the task labels. We then derive an analytical solution to the formulated problem, obtaining self-consistent equations that determine the optimal mapping between the source and the semantic-reconstructed images. To implement TOSC-SR, we further obtain an extended rate-distortion formulation based on the variational approximation of mutual information, which is applicable to multiple AI tasks. Experimental results show that the proposed approach outperforms traditional JPEG-, JPEG2000-, BPG-, and VVC-based image communication systems, as well as deep-learning-based benchmarks, in terms of image reconstruction quality, AI task performance, and multi-task generalization ability.
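As a rough illustration of the kind of objective the abstract describes, the sketch below combines a rate term, a pixel-level distortion, and a semantic term per task. It is not the paper's formulation: the function names, the weights `lam_pixel` and `lam_sem`, the use of mean squared error as the pixel distortion, and the use of cross-entropy as the semantic term are all assumptions made for this example. The cross-entropy choice relies on a standard fact: the expected cross-entropy of a classifier q(y | x_hat) upper-bounds H(Y | X_hat), so minimizing it maximizes a variational lower bound on the mutual information I(X_hat; Y), because H(Y) does not depend on the model.

```python
import math

def mse(x, x_hat):
    """Pixel-level distortion: mean squared error between two flat pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cross_entropy(probs, label):
    """Negative log-probability (in nats) assigned to the true label.

    Its expectation upper-bounds H(Y | X_hat), so it serves here as a
    variational surrogate for the mutual-information-based semantic term.
    """
    return -math.log(probs[label])

def toy_rd_loss(rate, x, x_hat, tasks, lam_pixel=1.0, lam_sem=1.0):
    """Hypothetical combined objective: rate + pixel term + summed task terms.

    `tasks` is a list of (predicted_probs, true_label) pairs, one per AI task,
    mirroring the multi-task extension mentioned in the abstract. The weights
    are illustrative assumptions, not values from the paper.
    """
    semantic = sum(cross_entropy(p, y) for p, y in tasks)
    return rate + lam_pixel * mse(x, x_hat) + lam_sem * semantic

# Toy usage: a 3-pixel "image", one binary task with an uninformative classifier.
x = [0.0, 0.5, 1.0]
x_hat = [0.0, 0.5, 0.5]
loss = toy_rd_loss(rate=2.0, x=x, x_hat=x_hat, tasks=[([0.5, 0.5], 0)])
```

In a learned system the rate would come from an entropy model and the probabilities from task networks applied to the reconstructed image. Here they are plain numbers so that the trade-off between the three terms is easy to inspect.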