MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification

Paper Title

MEVID: Multi-view Extended Videos with Identities for Video Person Re-Identification

Authors

Daniel Davila, Dawei Du, Bryon Lewis, Christopher Funk, Joseph Van Pelt, Roderick Collins, Kellie Corona, Matt Brown, Scott McCloskey, Anthony Hoogs, Brian Clipp

Abstract

In this paper, we present the Multi-view Extended Videos with Identities (MEVID) dataset for large-scale, video person re-identification (ReID) in the wild. To our knowledge, MEVID represents the most-varied video person ReID dataset, spanning an extensive indoor and outdoor environment across nine unique dates in a 73-day window, various camera viewpoints, and entity clothing changes. Specifically, we label the identities of 158 unique people wearing 598 outfits taken from 8,092 tracklets, average length of about 590 frames, seen in 33 camera views from the very large-scale MEVA person activities dataset. While other datasets have more unique identities, MEVID emphasizes a richer set of information about each individual, such as: 4 outfits/identity vs. 2 outfits/identity in CCVID, 33 viewpoints across 17 locations vs. 6 in 5 simulated locations for MTA, and 10 million frames vs. 3 million for LS-VID. Being based on the MEVA video dataset, we also inherit data that is intentionally demographically balanced to the continental United States. To accelerate the annotation process, we developed a semi-automatic annotation framework and GUI that combines state-of-the-art real-time models for object detection, pose estimation, person ReID, and multi-object tracking. We evaluate several state-of-the-art methods on MEVID challenge problems and comprehensively quantify their robustness in terms of changes of outfit, scale, and background location. Our quantitative analysis on the realistic, unique aspects of MEVID shows that there are significant remaining challenges in video person ReID and indicates important directions for future research.
