Reconstruction Regularized Deep Metric Learning for Multi-label Image Classification

Paper title

Reconstruction Regularized Deep Metric Learning for Multi-label Image Classification

Authors

Li, Changsheng; Liu, Chong; Duan, Lixin; Gao, Peng; Zheng, Kai

Abstract

In this paper, we present a novel deep metric learning method to tackle the multi-label image classification problem. To better learn the correlations among image features, as well as among labels, we explore a latent space in which images and labels are embedded via two separate deep neural networks. To capture the relationships between image features and labels, we aim to learn a two-way deep distance metric over the embedding space from two different views: the distance between an image and its labels should be smaller than the distances between the image and its labels' nearest neighbors, and also smaller than the distances between the labels and the other images corresponding to the labels' nearest neighbors. Moreover, a reconstruction module for recovering the correct labels is incorporated into the whole framework as a regularization term, so that the label embedding space is more representative. Our model can be trained in an end-to-end manner. Experimental results on publicly available image datasets corroborate the efficacy of our method compared with state-of-the-art methods.
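The two-way distance constraint and the reconstruction regularizer described in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: the hinge form with a fixed margin, the squared Euclidean distance, the squared-error reconstruction penalty, and the function and parameter names (`two_way_loss`, `reconstruction_penalty`, `total_loss`, `margin`, `weight`) are all hypothetical and may differ from the authors' formulation. The embeddings are plain lists standing in for the outputs of the two networks.

```python
# Illustrative sketch only: hinge form, margin, distance, and weighting
# are assumptions, not taken from the paper.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hinge(x):
    """Penalize only violated constraints (positive values)."""
    return max(0.0, x)


def two_way_loss(img_emb, label_emb, neighbor_label_embs,
                 neighbor_img_embs, margin=1.0):
    """Two-view margin loss for one image and its label embedding.

    View 1: the image should be closer to its own label embedding than
            to the embeddings of that label's nearest neighbors.
    View 2: the label embedding should be closer to its own image than
            to the images belonging to those neighboring labels.
    """
    d_pos = sq_dist(img_emb, label_emb)
    view_image = sum(hinge(margin + d_pos - sq_dist(img_emb, n))
                     for n in neighbor_label_embs)
    view_label = sum(hinge(margin + d_pos - sq_dist(label_emb, m))
                     for m in neighbor_img_embs)
    return view_image + view_label


def reconstruction_penalty(true_labels, reconstructed_labels):
    """Squared error between the true label vector and the label vector
    recovered from the label embedding (assumed form of the regularizer)."""
    return sq_dist(true_labels, reconstructed_labels)


def total_loss(img_emb, label_emb, neighbor_label_embs, neighbor_img_embs,
               true_labels, reconstructed_labels, margin=1.0, weight=0.1):
    """Metric loss plus the weighted reconstruction regularizer."""
    metric = two_way_loss(img_emb, label_emb, neighbor_label_embs,
                          neighbor_img_embs, margin)
    return metric + weight * reconstruction_penalty(true_labels,
                                                    reconstructed_labels)
```

In an end-to-end setup, a loss of this shape would be averaged over a mini-batch and backpropagated through both embedding networks and the reconstruction module; the plain-Python version above only shows which distances are compared in each view.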
