Paper Title

Diversity-Based Generalization for Unsupervised Text Classification under Domain Shift

Paper Authors

Jitin Krishnan, Hemant Purohit, Huzefa Rangwala

Abstract

Domain adaptation approaches seek to learn from a source domain and generalize to an unseen target domain. At present, the state-of-the-art unsupervised domain adaptation approaches for subjective text classification leverage unlabeled target data along with labeled source data. In this paper, we propose a novel domain adaptation method for single-task text classification based on a simple but effective idea of diversity-based generalization, which requires no unlabeled target data yet still matches the state of the art in performance. Diversity encourages the model to generalize better and remain insensitive to domain shift by forcing it not to rely on the same features for prediction. We apply this concept to the most interpretable component of neural networks: the attention layer. To generate sufficient diversity, we create a multi-head attention model and impose a diversity constraint between the attention heads so that each head learns differently. We further extend the model via tri-training, designing a procedure with an additional diversity constraint between the attention heads of the tri-trained classifiers. Extensive evaluation on the standard benchmark dataset of Amazon reviews and a newly constructed dataset of Crisis events shows that our fully unsupervised method matches competing baselines that use unlabeled target data. Our results demonstrate that machine learning architectures that ensure sufficient diversity generalize better, encouraging future research into ubiquitously usable learning models that do not require unlabeled target data.
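The abstract does not spell out the exact form of the diversity constraint, so the following is a minimal PyTorch sketch, assuming a Lin et al. (2017)-style redundancy penalty ||AAᵀ − I||²_F applied across the heads' attention distributions. The names (MultiHeadAttentionClassifier, diversity_penalty) and the penalty weight 0.1 are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttentionClassifier(nn.Module):
    """Sketch (assumed architecture): multi-head attention over token
    embeddings, trained with a penalty that pushes the heads apart."""

    def __init__(self, embed_dim: int, num_heads: int, num_classes: int):
        super().__init__()
        # One attention scorer per head (illustrative parameterization).
        self.head_scorers = nn.Linear(embed_dim, num_heads)
        self.classifier = nn.Linear(embed_dim * num_heads, num_classes)

    def forward(self, token_embeddings: torch.Tensor):
        # token_embeddings: (batch, seq_len, embed_dim)
        scores = self.head_scorers(token_embeddings)       # (batch, seq_len, heads)
        attn = F.softmax(scores, dim=1)                    # per head: distribution over tokens
        # Attention-weighted sum per head -> (batch, heads, embed_dim)
        context = torch.einsum("bsh,bsd->bhd", attn, token_embeddings)
        logits = self.classifier(context.flatten(start_dim=1))
        return logits, attn

def diversity_penalty(attn: torch.Tensor) -> torch.Tensor:
    """||A A^T - I||_F^2 averaged over the batch: the off-diagonal entries
    shrink only when different heads attend to different tokens."""
    a = attn.transpose(1, 2)                               # (batch, heads, seq_len)
    gram = torch.bmm(a, a.transpose(1, 2))                 # (batch, heads, heads)
    eye = torch.eye(a.size(1), device=attn.device)
    return ((gram - eye) ** 2).sum(dim=(1, 2)).mean()

# Usage: total loss = cross-entropy + lambda * diversity term (lambda assumed).
model = MultiHeadAttentionClassifier(embed_dim=128, num_heads=4, num_classes=2)
x = torch.randn(8, 20, 128)                                # dummy batch of embeddings
y = torch.randint(0, 2, (8,))
logits, attn = model(x)
loss = F.cross_entropy(logits, y) + 0.1 * diversity_penalty(attn)
loss.backward()
```

The off-diagonal entries of AAᵀ measure the overlap between pairs of heads' attention distributions, so driving them toward zero forces each head to rely on different features, which is the intuition behind diversity-based generalization described above.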
