Paper Title
Prompting for Multi-Modal Tracking
Paper Authors
Paper Abstract
Multi-modal tracking has gained attention due to its greater accuracy and robustness in complex scenarios compared to traditional RGB-based tracking. The key lies in how to fuse multi-modal data and bridge the gap between modalities. However, multi-modal tracking still suffers severely from data deficiency, which results in insufficient learning of fusion modules. Instead of building such a fusion module, in this paper we provide a new perspective on multi-modal tracking by attaching importance to multi-modal visual prompts. We design a novel multi-modal prompt tracker (ProTrack), which transfers the multi-modal inputs to a single modality via the prompt paradigm. By fully exploiting the tracking ability of RGB trackers pre-trained at scale, our ProTrack can achieve high-performance multi-modal tracking by only altering the inputs, even without any extra training on multi-modal data. Extensive experiments on 5 benchmark datasets demonstrate the effectiveness of the proposed ProTrack.
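The core idea of "altering the inputs" can be sketched as follows. Assuming the prompt is a simple pixel-level convex combination of the RGB frame and an aligned auxiliary-modality frame (an illustrative assumption; the abstract does not specify the exact prompt form), the fused frame can then be fed unchanged to any frozen, pre-trained RGB tracker:

```python
import numpy as np

def modal_prompt(rgb, aux, lam=0.6):
    """Hypothetical multi-modal visual prompt: fuse an RGB frame with an
    aligned auxiliary-modality frame (e.g. thermal or depth) into a single
    RGB-like input.

    rgb, aux: float arrays in [0, 1] with shape (H, W, 3)
    lam:      assumed hyper-parameter weighting the RGB modality
    """
    # Convex combination keeps the result in the tracker's expected
    # input range, so no retraining of the RGB tracker is required.
    return lam * rgb + (1.0 - lam) * aux

# Example: fuse a dummy RGB frame with a dummy (replicated) thermal frame.
rgb = np.random.rand(8, 8, 3)
thermal = np.broadcast_to(np.random.rand(8, 8, 1), (8, 8, 3)).copy()
prompted = modal_prompt(rgb, thermal)
```

The output `prompted` has the same shape and value range as an ordinary RGB frame, which is what lets a pre-trained single-modality tracker consume multi-modal data without any architectural change.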