Paper Title

Position and Rotation Invariant Sign Language Recognition from 3D Kinect Data with Recurrent Neural Networks

Authors

Prasun Roy, Saumik Bhattacharya, Partha Pratim Roy, Umapada Pal

Abstract

Sign language is a gesture-based symbolic communication medium among speech- and hearing-impaired people. It also serves as a communication bridge between non-impaired and impaired populations. Unfortunately, in most situations a non-impaired person is not well conversant in such symbolic languages, restricting the natural information flow between these two groups. Therefore, an automated mechanism that seamlessly translates sign language into natural language can be highly advantageous. In this paper, we attempt to recognize 30 basic Indian sign gestures. Gestures are represented as temporal sequences of 3D maps (RGB + depth), each consisting of the 3D coordinates of 20 body joints captured by the Kinect sensor. A recurrent neural network (RNN) is employed as the classifier. To improve the classifier's performance, we use a geometric transformation to correct the alignment of the depth frames. In our experiments, the model achieves 84.81% accuracy.
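The abstract's position and rotation invariance could be achieved by normalizing each 20-joint skeleton frame before feeding it to the RNN. The following is a minimal sketch of one such geometric transformation, not the paper's exact method: translate the skeleton so the hip center sits at the origin (position invariance), then rotate about the vertical axis so the shoulder line lies along the x-axis (rotation invariance). The joint indices are an assumption modeled on the Kinect v1 skeleton layout; the paper does not specify its index order.

```python
import numpy as np

# Hypothetical joint indices (assumption -- modeled on the
# Kinect v1 20-joint skeleton; the paper does not give an order).
HIP_CENTER, SHOULDER_LEFT, SHOULDER_RIGHT = 0, 4, 8

def normalize_skeleton(joints):
    """Make a (20, 3) array of joint coordinates position and
    rotation invariant: translate the hip center to the origin,
    then rotate about the vertical (y) axis so the shoulder line
    lies along the x-axis."""
    joints = np.asarray(joints, dtype=float)
    # Position invariance: subtract the hip-center coordinates.
    centered = joints - joints[HIP_CENTER]
    # Rotation invariance: angle of the shoulder vector in the x-z plane.
    shoulder = centered[SHOULDER_RIGHT] - centered[SHOULDER_LEFT]
    theta = np.arctan2(shoulder[2], shoulder[0])
    c, s = np.cos(theta), np.sin(theta)
    # Rotation about the y axis that cancels theta, zeroing the
    # z component of the shoulder vector.
    rot = np.array([[c,   0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s,  0.0, c]])
    # Apply the rotation to every joint (rows are joints).
    return centered @ rot.T
```

Applied per frame, this makes the temporal sequence of joint coordinates independent of where the signer stands and which way they face, so the classifier only has to model the gesture itself.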
