Paper Title

Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss

Paper Authors

Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens

Paper Abstract

We revisit the weakly supervised cross-modal face-name alignment task; that is, given an image and a caption, we label the faces in the image with the names occurring in the caption. Whereas past approaches have learned the latent alignment between names and faces by uncertainty reasoning over a set of images and their respective captions, in this paper, we rely on appropriate loss functions to learn the alignments in a neural network setting and propose SECLA and SECLA-B. SECLA is a Symmetry-Enhanced Contrastive Learning-based Alignment model that can effectively maximize the similarity scores between corresponding faces and names in a weakly supervised fashion. A variation of the model, SECLA-B, learns to align names and faces as humans do, that is, learning from easy to hard cases to further increase the performance of SECLA. More specifically, SECLA-B applies a two-stage learning framework: (1) training the model on an easy subset with only a few names and faces in each image-caption pair; (2) leveraging the known pairs of names and faces from the easy cases using a bootstrapping strategy, with an additional loss to prevent forgetting while learning new alignments at the same time. We achieve state-of-the-art results on both the augmented Labeled Faces in the Wild dataset and the Celebrity Together dataset. In addition, we believe that our methods can be adapted to other multimodal news understanding tasks.
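The abstract describes the loss only at a high level. As a rough illustration of the idea, the sketch below implements a generic symmetric (two-direction), bag-level contrastive loss in PyTorch, where the only supervision is which faces and names co-occur in the same image-caption pair. The function name, the InfoNCE-style formulation, and the temperature value are assumptions made for illustration, not the exact SECLA objective.

```python
import torch
import torch.nn.functional as F

def symmetric_bag_contrastive_loss(face_emb, name_emb, face_bags, name_bags,
                                   temperature=0.07):
    """Illustrative symmetric contrastive loss for weakly supervised
    face-name alignment (a sketch, not the SECLA loss itself).

    face_emb  : (F, d) face embeddings from a batch of images
    name_emb  : (N, d) name embeddings from the paired captions
    face_bags : (F,) long tensor, image-caption pair id of each face
    name_bags : (N,) long tensor, image-caption pair id of each name
    """
    face_emb = F.normalize(face_emb, dim=-1)
    name_emb = F.normalize(name_emb, dim=-1)

    # Cosine similarity between every face and every name in the batch.
    sim = face_emb @ name_emb.t() / temperature               # (F, N)

    # Weak supervision: a face's positives are all names from its own
    # bag (the exact correspondence is unknown); negatives are names
    # from other image-caption pairs.
    pos_mask = face_bags.unsqueeze(1).eq(name_bags.unsqueeze(0))  # (F, N)

    def directional_loss(logits, mask):
        # InfoNCE with multiple positives: -log( sum_pos exp / sum_all exp ).
        all_lse = torch.logsumexp(logits, dim=1)
        pos_lse = torch.logsumexp(logits.masked_fill(~mask, float("-inf")), dim=1)
        valid = mask.any(dim=1)                 # skip rows with no positives
        return (all_lse - pos_lse)[valid].mean()

    loss_f2n = directional_loss(sim, pos_mask)           # faces -> names
    loss_n2f = directional_loss(sim.t(), pos_mask.t())   # names -> faces

    # Symmetry: average both directions so neither modality dominates.
    return 0.5 * (loss_f2n + loss_n2f)

# Example: 2 image-caption pairs, 3 faces and 3 names in total.
faces = torch.randn(3, 128)
names = torch.randn(3, 128)
loss = symmetric_bag_contrastive_loss(
    faces, names, torch.tensor([0, 0, 1]), torch.tensor([0, 1, 1]))
```

Averaging the face-to-name and name-to-face directions is one simple way to make such a loss symmetric; the paper's actual formulation and its bootstrapped second stage for SECLA-B are described in the full text.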
