Paper Title
TCTrack: Temporal Contexts for Aerial Tracking
Paper Authors
Paper Abstract
Temporal contexts among consecutive frames are far from being fully utilized in existing visual trackers. In this work, we present TCTrack, a comprehensive framework to fully exploit temporal contexts for aerial tracking. The temporal contexts are incorporated at two levels: the extraction of features and the refinement of similarity maps. Specifically, for feature extraction, an online temporally adaptive convolution is proposed to enhance the spatial features using temporal information, which is achieved by dynamically calibrating the convolution weights according to the previous frames. For similarity map refinement, we propose an adaptive temporal transformer, which first effectively encodes temporal knowledge in a memory-efficient way, before the temporal knowledge is decoded for accurate adjustment of the similarity map. TCTrack is effective and efficient: evaluation on four aerial tracking benchmarks shows its impressive performance; real-world UAV tests show its high speed of over 27 FPS on NVIDIA Jetson AGX Xavier.
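The idea of calibrating convolution weights from previous frames can be illustrated with a toy sketch. This is not the TCTrack implementation: the class `OnlineAdaptiveConv`, the helper names, the 1D signals standing in for feature maps, and the calibration rule `1 + tanh(mean past descriptor)` are all assumptions chosen only to show the mechanism. A fixed base kernel is rescaled per input channel using descriptors pooled from earlier frames, and only past frames are consulted, so it can run online.

```python
import math
from collections import deque


def channel_means(frame):
    """Per-channel descriptor: the mean of each channel's values."""
    return [sum(channel) / len(channel) for channel in frame]


def calibrated_weights(base_weights, history):
    """Scale each input channel of the base kernel using past descriptors.

    The factor 1 + tanh(average past descriptor) is a placeholder
    assumption, not the calibration used in the paper.
    """
    if not history:
        return base_weights  # no temporal context yet: base kernel unchanged
    n_in = len(base_weights[0])
    factors = []
    for c in range(n_in):
        avg = sum(desc[c] for desc in history) / len(history)
        factors.append(1.0 + math.tanh(avg))
    return [[[w * factors[c] for w in kernel]
             for c, kernel in enumerate(out_channel)]
            for out_channel in base_weights]


def conv1d(frame, weights):
    """Valid 1D cross-correlation with frame[c][i] and weights[o][c][k]."""
    k = len(weights[0][0])
    length = len(frame[0]) - k + 1
    return [[sum(weights[o][c][j] * frame[c][i + j]
                 for c in range(len(frame)) for j in range(k))
             for i in range(length)]
            for o in range(len(weights))]


class OnlineAdaptiveConv:
    """Hypothetical layer whose kernel is recalibrated from previous frames."""

    def __init__(self, base_weights, window=3):
        self.base_weights = base_weights
        self.history = deque(maxlen=window)  # descriptors of past frames only

    def step(self, frame):
        weights = calibrated_weights(self.base_weights, self.history)
        out = conv1d(frame, weights)
        # Store the descriptor after use, so the current frame only
        # influences the kernels applied to later frames.
        self.history.append(channel_means(frame))
        return out


# One input channel, one output channel, kernel size 2.
layer = OnlineAdaptiveConv(base_weights=[[[1.0, 1.0]]], window=2)
first = layer.step([[0.0, 0.0, 0.0]])   # no history: base kernel
second = layer.step([[1.0, 2.0, 3.0]])  # history mean 0.0, factor 1.0
third = layer.step([[1.0, 2.0, 3.0]])   # history mean 1.0, factor 1 + tanh(1)
```

The same input frame yields different outputs at the second and third steps because the kernel has been rescaled by the accumulated temporal context. That is the behavior the abstract attributes to the online temporally adaptive convolution, shown here under the simplifying assumptions above.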