CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

Paper Title

CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

Authors

Siddhant Kharbanda, Atmadeep Banerjee, Erik Schultheis, Rohit Babbar

Abstract

Extreme Multi-label Text Classification (XMC) involves learning a classifier that can assign to an input a subset of the most relevant labels from millions of label choices. Recent approaches, such as XR-Transformer and LightXML, leverage a transformer instance to achieve state-of-the-art performance. However, in this process, these approaches need to make various trade-offs between performance and computational requirements. A major shortcoming, as compared to the Bi-LSTM based AttentionXML, is that they fail to keep separate feature representations for each resolution in a label tree. We thus propose CascadeXML, an end-to-end multi-resolution learning pipeline, which can harness the multi-layered architecture of a transformer model for attending to different label resolutions with separate feature representations. CascadeXML significantly outperforms all existing approaches with non-trivial gains obtained on benchmark datasets consisting of up to three million labels. Code for CascadeXML will be made publicly available at https://github.com/xmc-aalto/cascadexml.
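
The idea of attending to different label resolutions with separate feature representations can be illustrated with a small sketch. This is not the authors' implementation: the function name `cascade_predict`, its parameters, and the toy label tree are hypothetical, and plain NumPy vectors stand in for the features that would come from different transformer layers. It only shows the general pattern of scoring a coarse level first, keeping the top candidates, and scoring just their children at the next, finer level with a different feature vector.

```python
import numpy as np

def cascade_predict(level_feats, level_weights, children, top_k):
    """Coarse-to-fine scoring over a label tree (illustrative sketch only).

    level_feats:   one feature vector per resolution, coarse to fine
                   (assumed to come from different encoder layers).
    level_weights: one (n_nodes, dim) linear classifier per resolution.
    children:      children[l][c] lists the level l+1 nodes under node c.
    top_k:         number of nodes kept (shortlisted) at every level.
    """
    # At the coarsest level every node is a candidate.
    candidates = np.arange(level_weights[0].shape[0])
    for level, (feat, weights) in enumerate(zip(level_feats, level_weights)):
        # Score only the shortlisted nodes with this level's own features.
        scores = weights[candidates] @ feat
        order = np.argsort(-scores)[:top_k]
        kept, kept_scores = candidates[order], scores[order]
        if level + 1 < len(level_weights):
            # Expand the kept nodes into their children for the next level.
            candidates = np.array(
                [child for node in kept for child in children[level][node]]
            )
    return kept.tolist(), kept_scores.tolist()

# Toy tree: 2 coarse clusters, each holding 2 of the 4 labels.
feats = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
weights = [
    np.array([[1.0, 0.0], [0.0, 1.0]]),
    np.array([[0.0, 1.0], [0.0, 3.0], [0.0, 5.0], [0.0, 7.0]]),
]
tree = [[[0, 1], [2, 3]]]
labels, scores = cascade_predict(feats, weights, tree, top_k=1)
print(labels, scores)
```

In this toy run, label 3 has the highest fine-level score but is never evaluated, because its parent cluster was dropped at the coarse level. That is the trade-off any shortlist-based scheme makes between computation and recall.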
