Paper Title

All Mistakes Are Not Equal: Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP)

Paper Authors

Ashwin Vaswani, Gaurav Aggarwal, Praneeth Netrapalli, Narayan G Hegde

Paper Abstract

This paper considers the problem of Hierarchical Multi-Label Classification (HMC), where (i) several labels can be present for each example, and (ii) labels are related via a domain-specific hierarchy tree. Guided by the intuition that all mistakes are not equal, we present Comprehensive Hierarchy Aware Multi-label Predictions (CHAMP), a framework that penalizes a misprediction depending on its severity as per the hierarchy tree. While such ideas have been applied to single-label classification, to the best of our knowledge, few works address the severity of mistakes in multi-label classification. The key reason is that there is no clear way of quantifying the severity of a misprediction a priori in the multi-label setting. In this work, we propose a simple but effective metric to quantify the severity of a mistake in HMC, naturally leading to CHAMP. Extensive experiments on six public HMC datasets across modalities (image, audio, and text) demonstrate that incorporating hierarchical information leads to substantial gains, as CHAMP improves both AUPRC (2.6% median percentage improvement) and hierarchical metrics (2.85% median percentage improvement) over stand-alone hierarchical or multi-label classification methods. Compared to standard multi-label baselines, CHAMP provides improved AUPRC in both robustness (8.87% mean percentage improvement) and low-data regimes. Further, our method provides a framework to enhance existing multi-label classification algorithms to make better mistakes (18.1% mean percentage increment).
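The abstract's core idea — weighting each misprediction by its severity under the label hierarchy — can be illustrated with a toy sketch. The paper does not publish this exact formulation in the abstract; the hierarchy, the tree-distance severity, and the severity-weighted binary cross-entropy below are all illustrative assumptions, not CHAMP's actual loss.

```python
import math

# Hypothetical toy label hierarchy (child -> parent); names are illustrative.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "vehicle": None,
    "car": "vehicle",
}

def ancestors(label):
    """Path from a label up to its root, inclusive."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path

def tree_distance(a, b):
    """Number of edges between two labels in the hierarchy forest."""
    pa, pb = ancestors(a), ancestors(b)
    common = set(pa) & set(pb)
    if not common:
        # Labels live in different trees: penalize by total depth.
        return len(pa) + len(pb)
    # Lowest common ancestor = first shared node on a's upward path.
    lca = next(x for x in pa if x in common)
    return pa.index(lca) + pb.index(lca)

def mistake_severity(predicted, true_labels):
    """Severity of a false positive: distance to the closest true label."""
    return min(tree_distance(predicted, t) for t in true_labels)

def severity_weighted_bce(probs, true_labels, all_labels):
    """Binary cross-entropy where false positives far from any true label
    (per the hierarchy) are penalized more heavily than nearby ones."""
    loss = 0.0
    for lbl in all_labels:
        p = min(max(probs[lbl], 1e-7), 1 - 1e-7)
        if lbl in true_labels:
            loss += -math.log(p)
        else:
            w = 1.0 + mistake_severity(lbl, true_labels)
            loss += -w * math.log(1 - p)
    return loss
```

Under this sketch, confidently predicting "cat" when the true label is "dog" (two edges apart) incurs a smaller penalty than predicting "car" with the same confidence, matching the "all mistakes are not equal" intuition.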
