Siamese Keypoint Prediction Network for Visual Object Tracking

Paper Title

Siamese Keypoint Prediction Network for Visual Object Tracking

Authors

Qiang Li, Zekui Qin, Wenbo Zhang, Wen Zheng

Abstract

Visual object tracking aims to estimate the location of an arbitrary target in a video sequence, given its initial bounding box. By exploiting offline feature learning, the Siamese paradigm has recently become the leading framework for high-performance tracking. However, existing Siamese trackers either rely heavily on complicated anchor-based detection networks or lack the ability to resist distractors. In this paper, we propose the Siamese keypoint prediction network (SiamKPN) to address these challenges. On top of a Siamese backbone for feature embedding, SiamKPN uses a cascade heatmap strategy for coarse-to-fine prediction modeling. In particular, the strategy is implemented by sequentially shrinking the coverage of the label heatmap along the cascade, which applies loose-to-strict intermediate supervision. During inference, we find that the predicted heatmaps of successive stages become gradually more concentrated on the target and weaker on the distractors. SiamKPN performs well against state-of-the-art trackers for visual object tracking on four benchmark datasets, namely OTB-100, VOT2018, LaSOT and GOT-10k, while running at real-time speed.
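The loose-to-strict supervision described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stage's label heatmap is a 2D Gaussian centred on the target and that its standard deviation shrinks from stage to stage. The abstract does not give the label shape, map size, or number of stages, so the function names, the 17x17 grid, and the sigma values below are all hypothetical.

```python
import math

def gaussian_label_heatmap(size, center, sigma):
    """Return a size x size label heatmap with a Gaussian peak at center=(row, col).

    Assumption: a Gaussian label is used here only as an example shape.
    """
    cy, cx = center
    return [
        [math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]

def cascade_labels(size, center, sigmas):
    """Build one label heatmap per cascade stage.

    Passing decreasing sigmas shrinks the label coverage stage by stage,
    which is the loose-to-strict idea sketched here.
    """
    return [gaussian_label_heatmap(size, center, s) for s in sigmas]

def coverage(heatmap, threshold=0.5):
    """Count the cells whose label value is at least threshold."""
    return sum(v >= threshold for row in heatmap for v in row)

# Hypothetical settings: 17x17 map, target at the centre, three stages.
labels = cascade_labels(size=17, center=(8, 8), sigmas=[4.0, 2.0, 1.0])
coverages = [coverage(h) for h in labels]
```

With these settings, `coverages` decreases from the first stage to the last, so later stages are supervised to respond only in a tighter region around the target.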
