Generative Reasoning Integrated Label Noise Robust Deep Image Representation Learning

Paper Title

Generative Reasoning Integrated Label Noise Robust Deep Image Representation Learning

Authors

Gencer Sumbul, Begüm Demir

Abstract

Deep learning based image representation learning (IRL) methods have attracted great attention for various image understanding problems. Most of these methods require a large quantity of high-quality annotated training images, which can be time-consuming and costly to gather. To reduce labeling costs, crowdsourced data, automatic labeling procedures or citizen science projects can be considered. However, such approaches increase the risk of including label noise in the training data. When discriminative reasoning is employed, label noise may result in overfitting on noisy labels. This leads to sub-optimal learning procedures and thus to inaccurate characterization of images. To address this, we introduce a generative reasoning integrated label noise robust deep representation learning (GRID) approach. Our approach aims to model the complementary characteristics of discriminative and generative reasoning for IRL under noisy labels. To this end, we first integrate generative reasoning into discriminative reasoning through a supervised variational autoencoder. This allows GRID to automatically detect training samples with noisy labels. Then, through our label noise robust hybrid representation learning strategy, GRID adjusts the whole learning procedure so that IRL for these samples is driven by generative reasoning and IRL for the other samples by discriminative reasoning. Our approach learns discriminative image representations while preventing interference from noisy labels, independently of the IRL method being selected. Thus, unlike existing methods, GRID does not depend on the type of annotation, neural network architecture, loss function or learning task, and can be directly utilized for various problems. Experimental results show its effectiveness compared to state-of-the-art methods. The code of GRID is publicly available at https://github.com/gencersumbul/GRID.
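
The hybrid idea in the abstract (generative reasoning for samples flagged as noisy, discriminative reasoning for the rest) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state how noisy samples are detected, so the `flag_noisy` rule below (flag the highest-loss fraction of a batch) is purely an assumption, as are all function names, the squared-error reconstruction term and the `noise_fraction` parameter. The official code is in the GRID repository linked above.

```python
import math

def cross_entropy(probs, label):
    """Discriminative loss: negative log-probability of the annotated label."""
    return -math.log(max(probs[label], 1e-12))

def gaussian_kl(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def generative_loss(x, x_recon, mu, log_var):
    """VAE-style loss: squared reconstruction error plus the KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + gaussian_kl(mu, log_var)

def flag_noisy(disc_losses, noise_fraction):
    """ASSUMPTION (not from the paper): treat the highest-loss fraction as noisy."""
    k = int(round(noise_fraction * len(disc_losses)))
    ranked = sorted(range(len(disc_losses)), key=lambda i: disc_losses[i], reverse=True)
    noisy = set(ranked[:k])
    return [i in noisy for i in range(len(disc_losses))]

def hybrid_loss(disc_losses, gen_losses, noisy_mask):
    """Batch mean: generative loss for flagged samples, discriminative otherwise."""
    per_sample = [g if noisy else d
                  for d, g, noisy in zip(disc_losses, gen_losses, noisy_mask)]
    return sum(per_sample) / len(per_sample)

# Hypothetical per-sample losses for a batch of four images.
disc = [0.1, 2.5, 0.3, 0.2]
gen = [1.0, 0.4, 1.2, 0.9]
mask = flag_noisy(disc, noise_fraction=0.25)
batch_loss = hybrid_loss(disc, gen, mask)
```

In this toy batch only the second sample is flagged, so its annotated label no longer contributes to the training signal, while the remaining samples are still trained discriminatively.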
