Paper Title

E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context

Paper Authors

Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong Liu

Paper Abstract

Recently, the image-wise implicit neural representation of videos, NeRV, has gained popularity for its promising results and swift speed compared to regular pixel-wise implicit representations. However, redundant parameters within the network structure can cause a large model size when scaling up for desirable performance. The key reason for this phenomenon is the coupled formulation of NeRV, which outputs the spatial and temporal information of video frames directly from the frame index input. In this paper, we propose E-NeRV, which dramatically expedites NeRV by decomposing the image-wise implicit neural representation into separate spatial and temporal contexts. Under the guidance of this new formulation, our model greatly reduces the redundant model parameters while retaining the representation ability. We experimentally find that our method can improve performance to a large extent with fewer parameters, resulting in more than $8\times$ faster convergence. Code is available at https://github.com/kyleleey/E-NeRV.
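To make the disentangled formulation concrete, here is a minimal PyTorch-style sketch of the idea described in the abstract: the frame index drives only a temporal embedding, while spatial structure lives in a learned feature map shared across frames, and the two are fused before decoding. This is an illustrative assumption, not the paper's actual architecture; the class name `DisentangledVideoINR`, all layer sizes, and the additive fusion are made up here, and the official repository above holds the real implementation.

```python
import torch
import torch.nn as nn

class DisentangledVideoINR(nn.Module):
    """Conceptual sketch (not E-NeRV's exact design): keep temporal
    context (from the frame index) separate from a learned spatial
    context, instead of regressing the whole frame from the index alone.
    All dimensions below are illustrative placeholders."""

    def __init__(self, embed_dim=64, feat_h=9, feat_w=16, channels=32):
        super().__init__()
        # Temporal branch: normalized frame index t in [0, 1] -> embedding.
        self.temporal_mlp = nn.Sequential(
            nn.Linear(1, embed_dim), nn.GELU(),
            nn.Linear(embed_dim, channels),
        )
        # Spatial branch: a learned low-resolution feature map shared by
        # all frames (the "spatial context").
        self.spatial_feat = nn.Parameter(torch.randn(1, channels, feat_h, feat_w))
        # Decoder: upsample the fused feature map to an RGB frame.
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),  # doubles spatial resolution
            nn.GELU(),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, t):
        # t: (B, 1) normalized frame indices.
        temb = self.temporal_mlp(t)                          # (B, C)
        fused = self.spatial_feat + temb[:, :, None, None]   # broadcast fusion
        return torch.sigmoid(self.decoder(fused))            # (B, 3, 2H, 2W)

model = DisentangledVideoINR()
frames = model(torch.tensor([[0.0], [0.5], [1.0]]))  # query three frames
print(frames.shape)  # torch.Size([3, 3, 18, 32])
```

The point of the sketch is the parameter accounting: because the index only has to produce a compact temporal embedding rather than all spatial detail, the index-conditioned layers can stay small, which is the abstract's stated route to fewer redundant parameters and faster convergence.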
