Paper Title
Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics
Paper Authors
Paper Abstract
Video coding, which aims to compress and reconstruct the whole frame, and feature compression, which preserves and transmits only the most critical information, stand at two ends of the scale. That is, the former offers full fidelity for human perception, while the latter offers compactness and efficiency for machine vision. Recent efforts along emerging trends in video compression, e.g., deep learning based coding tools and end-to-end image/video coding, and the MPEG-7 compact feature descriptor standards, i.e., Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote sustainable and fast development in their respective directions. In this paper, thanks to booming AI technology, e.g., prediction and generation models, we explore a new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with Digital Retina, a rising instance of Analyze then Compress, we first give the definition, formulation, and paradigm of VCM. We then systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides academic and industrial evidence for realizing the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we propose potential VCM solutions, and preliminary results demonstrate gains in performance and efficiency. Future directions are also discussed.