Paper Title
Cross-subject Action Unit Detection with Meta Learning and Transformer-based Relation Modeling
Paper Authors
Paper Abstract
Facial Action Unit (AU) detection is a crucial task in emotion analysis from facial movements. Appearance differences among subjects can be confused with the changes caused by AUs, leading to inaccurate results, yet most existing deep-learning-based AU detection methods do not take the identity information of different subjects into account. This paper proposes a meta-learning-based cross-subject AU detection model to eliminate identity-caused differences. In addition, a transformer-based relation learning module is introduced to learn the latent relations among multiple AUs. Specifically, the proposed work consists of two sub-tasks. The first sub-task is meta-learning-based AU local region representation learning, called MARL, which learns discriminative representations of local AU regions that incorporate information shared across multiple subjects and eliminate identity-caused differences. The second sub-task takes the local AU region representations from the first sub-task as input and adds relation learning based on the transformer encoder architecture to capture AU relationships. The entire training process is cascaded. Ablation studies and visualizations show that MARL can eliminate identity-caused differences, yielding a robust and generalized discriminative embedding for AUs. On the two public datasets BP4D and DISFA, our method outperforms the state of the art, improving the F1 score by 1.3% and 1.4%, respectively.
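The abstract does not specify how the meta-learning in MARL is implemented. As a loose illustration of the idea of treating each subject as a meta-learning task, the sketch below uses a first-order, MAML-style episodic update in PyTorch; the function name, the `inner_lr` parameter, and the support/query batch layout are all assumptions for illustration, not the authors' code.

```python
# A minimal first-order, MAML-style episodic loop (a sketch, not the paper's
# method): each subject is treated as one task, with support/query frames
# drawn from that subject.
import torch
import torch.nn.functional as F

def meta_train_step(model, subject_tasks, meta_optimizer, inner_lr=0.01):
    """One meta-update over a batch of subjects (tasks).

    subject_tasks: iterable of ((support_x, support_y), (query_x, query_y)),
    where y holds multi-label AU annotations in {0, 1}.
    """
    meta_optimizer.zero_grad()
    for (sx, sy), (qx, qy) in subject_tasks:
        # Inner loop: adapt a cloned copy of the parameters on this subject's
        # support set. The clones stay connected to the original parameters,
        # so the query loss below backpropagates into them (first-order
        # approximation, since create_graph defaults to False).
        fast = {n: p.clone() for n, p in model.named_parameters()}
        s_logits = torch.func.functional_call(model, fast, (sx,))
        s_loss = F.binary_cross_entropy_with_logits(s_logits, sy)
        grads = torch.autograd.grad(s_loss, list(fast.values()))
        fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}
        # Outer loop: evaluate the adapted weights on the same subject's query
        # set; meta-gradients accumulate across subjects in model.parameters().
        q_logits = torch.func.functional_call(model, fast, (qx,))
        F.binary_cross_entropy_with_logits(q_logits, qy).backward()
    meta_optimizer.step()
```

Because the adaptation happens per subject while the meta-update aggregates over many subjects, the shared initialization is pushed toward representations that transfer across identities, which matches the abstract's stated goal of removing identity-caused differences.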
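Likewise, the transformer-based relation module is described only at a high level. A plausible reading, sketched below in PyTorch, treats each AU's local-region embedding as one token and lets a standard transformer encoder model the dependencies among AUs; the class name, the dimensions, and the default of 12 AUs (the usual BP4D evaluation protocol) are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AURelationEncoder(nn.Module):
    """Hypothetical relation module: each AU's local-region embedding is one
    token, and self-attention in a transformer encoder lets the tokens attend
    to one another, so co-occurrence patterns among AUs can be captured."""

    def __init__(self, num_aus=12, embed_dim=256, num_heads=4, num_layers=2):
        super().__init__()
        # Learnable per-AU position embedding, so the encoder knows which
        # token corresponds to which AU.
        self.au_pos = nn.Parameter(torch.zeros(1, num_aus, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(embed_dim, 1)  # one occurrence logit per AU

    def forward(self, au_tokens):            # (batch, num_aus, embed_dim)
        x = self.encoder(au_tokens + self.au_pos)
        return self.head(x).squeeze(-1)      # (batch, num_aus) logits
```

In the cascaded setup described in the abstract, the per-AU tokens would come from the trained MARL backbone (a tensor of shape (batch, num_aus, embed_dim) in this sketch), and the resulting per-AU logits would be trained with a multi-label binary cross-entropy loss.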