Paper Title
SLGTformer: An Attention-Based Approach to Sign Language Recognition
Paper Authors
Paper Abstract
Sign language is the preferred method of communication for deaf or mute people, but, like any language, it is difficult to learn and represents a significant barrier for those who are hard of hearing or unable to speak. A person's entire frontal appearance dictates and conveys specific meaning. However, this frontal appearance can be quantified as a temporal sequence of human body poses, which enables Sign Language Recognition through learning the spatiotemporal dynamics of skeleton keypoints. We propose a novel, attention-based approach to Sign Language Recognition built exclusively upon decoupled graph and temporal self-attention: the Sign Language Graph Time Transformer (SLGTformer). SLGTformer first deconstructs spatiotemporal pose sequences separately into spatial graphs and temporal windows. SLGTformer then leverages novel Learnable Graph Relative Positional Encodings (LGRPE) to guide spatial self-attention with the graph neighborhood context of the human skeleton. By modeling the temporal dimension as intra- and inter-window dynamics, we introduce Temporal Twin Self-Attention (TTSA) as the combination of locally-grouped temporal attention (LTA) and global sub-sampled temporal attention (GSTA). We demonstrate the effectiveness of SLGTformer on the Word-Level American Sign Language (WLASL) dataset, achieving state-of-the-art performance with an ensemble-free approach on the keypoint modality. The code is available at https://github.com/neilsong/slt
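The intra- and inter-window split behind TTSA can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that); it rests on several simplifying assumptions: a single joint's feature sequence of shape (T, D), identity query/key/value projections instead of learned ones, a single head, and mean pooling as a stand-in for the sub-sampling step. All function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention (no learned projections; an assumption)."""
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def local_temporal_attention(x, window):
    """Intra-window step: each frame attends only to frames in its own window."""
    t, d = x.shape
    assert t % window == 0, "sketch assumes T is divisible by the window size"
    w = x.reshape(t // window, window, d)   # (num_windows, window, D)
    return attention(w, w, w).reshape(t, d)

def global_subsampled_attention(x, window):
    """Inter-window step: each frame attends to one summary per window.
    Mean pooling is an assumed stand-in for the sub-sampling operation."""
    t, d = x.shape
    assert t % window == 0, "sketch assumes T is divisible by the window size"
    summaries = x.reshape(t // window, window, d).mean(axis=1)  # (num_windows, D)
    return attention(x, summaries, summaries)

def temporal_twin_block(x, window):
    """Local attention followed by global attention, each with a residual."""
    x = x + local_temporal_attention(x, window)
    return x + global_subsampled_attention(x, window)

# Usage: 8 frames of 4-dimensional features, split into windows of 4 frames.
frames = np.random.default_rng(0).normal(size=(8, 4))
out = temporal_twin_block(frames, window=4)
```

In this sketch the local step costs attention over `window` frames per window, while the global step lets every frame see a coarse summary of the whole sequence, which is the general motivation for pairing the two.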