Paper Title
Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey
Authors
Abstract
In this survey, we present a systematic review of 3D hand pose estimation from the perspective of efficient annotation and learning. 3D hand pose estimation has been an important research area owing to its potential to enable various applications, such as video understanding, AR/VR, and robotics. However, model performance is tied to the quality and quantity of annotated 3D hand poses. At present, acquiring such annotations is challenging, e.g., due to the difficulty of 3D annotation and the presence of occlusion. To shed light on this problem, we review the pros and cons of existing annotation methods, which we classify as manual, synthetic-model-based, hand-sensor-based, and computational approaches. Additionally, we examine methods for learning 3D hand poses when annotated data are scarce, including self-supervised pretraining, semi-supervised learning, and domain adaptation. Based on this study of efficient annotation and learning, we further discuss limitations and possible future directions in this field.