Paper Title

A Prototype-Based Generalized Zero-Shot Learning Framework for Hand Gesture Recognition

Paper Authors

Jinting Wu, Yujia Zhang, Xiaoguang Zhao

Paper Abstract

Hand gesture recognition plays a significant role in human-computer interaction for understanding various human gestures and their intent. However, most prior works can only recognize gestures of limited labeled classes and fail to adapt to new categories. The task of Generalized Zero-Shot Learning (GZSL) for hand gesture recognition aims to address the above issue by leveraging semantic representations and detecting both seen and unseen class samples. In this paper, we propose an end-to-end prototype-based GZSL framework for hand gesture recognition which consists of two branches. The first branch is a prototype-based detector that learns gesture representations and determines whether an input sample belongs to a seen or unseen category. The second branch is a zero-shot label predictor which takes the features of unseen classes as input and outputs predictions through a learned mapping mechanism between the feature and the semantic space. We further establish a hand gesture dataset that specifically targets this GZSL task, and comprehensive experiments on this dataset demonstrate the effectiveness of our proposed approach on recognizing both seen and unseen gestures.
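The two-branch inference described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prototypes, the linear feature-to-semantic mapping `W`, the distance threshold `tau`, and the nearest-neighbor decision rules are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: 3 seen classes with learned feature-space prototypes,
# 2 unseen classes described only by semantic vectors (e.g. attributes).
seen_prototypes = rng.normal(size=(3, 8))   # one prototype per seen class
unseen_semantics = rng.normal(size=(2, 4))  # semantic embeddings of unseen classes
W = rng.normal(size=(8, 4))                 # assumed linear feature -> semantic mapping
tau = 3.0                                   # assumed seen/unseen decision threshold

def predict(x):
    # Branch 1: prototype-based detector. If the input feature lies close
    # enough to some seen-class prototype, label it with that seen class.
    d_seen = np.linalg.norm(seen_prototypes - x, axis=1)
    if d_seen.min() < tau:
        return ("seen", int(d_seen.argmin()))
    # Branch 2: zero-shot label predictor. Map the feature into the
    # semantic space and pick the nearest unseen-class embedding.
    s = x @ W
    d_unseen = np.linalg.norm(unseen_semantics - s, axis=1)
    return ("unseen", int(d_unseen.argmin()))
```

A feature that coincides with a seen prototype is routed to branch 1, while a feature far from all prototypes falls through to the zero-shot branch; the threshold `tau` controls that trade-off between seen-class accuracy and unseen-class detection.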