Paper Title

The Overlooked Classifier in Human-Object Interaction Recognition

Authors

Ying Jin, Yinpeng Chen, Lijuan Wang, Jianfeng Wang, Pei Yu, Lin Liang, Jenq-Neng Hwang, Zicheng Liu

Abstract

Human-Object Interaction (HOI) recognition is challenging due to two factors: (1) significant imbalance across classes and (2) the need for multiple labels per image. This paper shows that both challenges can be effectively addressed by improving the classifier while leaving the backbone architecture untouched. First, we encode the semantic correlation among classes into the classification head by initializing its weights with language embeddings of HOIs. As a result, performance is boosted significantly, especially for the few-shot subset. Second, we propose a new loss named LSE-Sign to enhance multi-label learning on a long-tailed dataset. Our simple yet effective method enables detection-free HOI classification, outperforming state-of-the-art methods that require object detection and human pose by a clear margin. Moreover, we transfer the classification model to instance-level HOI detection by connecting it with an off-the-shelf object detector. We achieve state-of-the-art results without additional fine-tuning.
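The first idea in the abstract, initializing the classification head with language embeddings of the HOI classes, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three class names, the 3-dimensional placeholder vectors standing in for text-encoder outputs, the L2 normalization, the bias-free linear head, and the helper names `l2_normalize`, `init_head_from_embeddings`, and `logits`.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def init_head_from_embeddings(class_embeddings):
    """Build the classifier weight matrix, one row per HOI class.

    Assumption: each row is simply the L2-normalized language embedding
    of that class's text description.
    """
    return [l2_normalize(e) for e in class_embeddings]

def logits(weights, feature):
    """Bias-free linear head: one dot product per class."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]

# Hypothetical 3-d "language embeddings"; a real text encoder would supply these.
hoi_embeddings = {
    "ride bicycle": [0.9, 0.1, 0.0],
    "ride horse": [0.8, 0.3, 0.1],
    "eat apple": [0.0, 0.2, 0.9],
}
class_names = list(hoi_embeddings)
W = init_head_from_embeddings([hoi_embeddings[c] for c in class_names])

# Hypothetical image feature, assumed to live in the same space as the embeddings.
image_feature = l2_normalize([0.85, 0.2, 0.05])
scores = dict(zip(class_names, logits(W, image_feature)))
```

In this toy setup the two "ride" classes start with similar weight rows, so an image feature that matches one of them also scores well on the other. That is one way to read the abstract's claim that semantic correlation among classes helps the few-shot subset: a rare class begins training close to its frequent neighbors instead of at a random point.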
