A Gating Model for Bias Calibration in Generalized Zero-shot Learning

Paper Title

A Gating Model for Bias Calibration in Generalized Zero-shot Learning

Authors

Gukyeong Kwon, Ghassan AlRegib

Abstract

Generalized zero-shot learning (GZSL) aims to train a model that can generalize to unseen class data using only auxiliary information. One of the main challenges in GZSL is that model predictions are biased toward seen classes, because the model overfits to the seen class data, which is the only data available during training. To overcome this issue, we propose a two-stream autoencoder-based gating model for GZSL. Our gating model predicts whether the query data is from seen classes or unseen classes, and uses separate seen and unseen experts to predict the class independently of each other. This framework avoids comparing the biased prediction scores for seen classes with the prediction scores for unseen classes. In particular, we measure the distance between visual and attribute representations in both the latent space and the cross-reconstruction space of the autoencoder. These distances serve as complementary features that characterize unseen classes at different levels of data abstraction. In addition, the two-stream autoencoder works as a unified framework for the gating model and the unseen expert, which makes the proposed method computationally efficient. We validate our proposed method on four benchmark image recognition datasets. In comparison with other state-of-the-art methods, we achieve the best harmonic mean accuracy on SUN and AWA2, and the second best on CUB and AWA1. Furthermore, our base model requires at least 20% fewer model parameters than state-of-the-art methods relying on generative models.
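
The routing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy prototypes, the summed-distance gate with a fixed threshold, and the nearest-prototype experts are all assumptions made for illustration. The only parts taken from the abstract are the overall flow (a gate decides seen versus unseen from distances in two spaces, then a separate expert predicts the class) and the use of harmonic mean accuracy, which is conventionally computed as H = 2SU / (S + U) for seen accuracy S and unseen accuracy U.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, prototypes):
    """Return (label, distance) of the prototype closest to the query."""
    return min(
        ((label, euclidean(query, p)) for label, p in prototypes.items()),
        key=lambda pair: pair[1],
    )


def predict(z_visual, x_recon, seen_latent, seen_recon, unseen_latent, threshold):
    """Hypothetical gate plus experts.

    The gate sums the distance to the closest seen-class attribute
    representation in the latent space and in the cross-reconstruction
    space. A small sum is treated as "seen". Both experts here are simple
    nearest-prototype classifiers, used only as stand-ins.
    """
    seen_label, d_latent = nearest(z_visual, seen_latent)
    _, d_recon = nearest(x_recon, seen_recon)
    if d_latent + d_recon < threshold:  # assumed gating rule
        return "seen", seen_label
    unseen_label, _ = nearest(z_visual, unseen_latent)
    return "unseen", unseen_label


def harmonic_mean(acc_seen, acc_unseen):
    """Harmonic mean of seen and unseen accuracy, H = 2SU / (S + U)."""
    total = acc_seen + acc_unseen
    return 0.0 if total == 0 else 2 * acc_seen * acc_unseen / total


# Toy attribute representations (made-up values, not from the paper).
seen_latent = {"horse": [0.0, 0.0], "dog": [1.0, 0.0]}
seen_recon = {"horse": [0.0, 0.0], "dog": [1.0, 0.0]}
unseen_latent = {"zebra": [5.0, 5.0], "whale": [-5.0, 5.0]}

near_seen = predict([0.1, 0.0], [0.1, 0.0],
                    seen_latent, seen_recon, unseen_latent, threshold=1.0)
far_from_seen = predict([4.5, 5.0], [4.5, 5.0],
                        seen_latent, seen_recon, unseen_latent, threshold=1.0)
```

Because the gate commits to one expert before any class score is produced, seen-class scores and unseen-class scores are never compared directly, which is the property the abstract highlights. In practice the gate would presumably be learned rather than set by a hand-picked threshold.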
