Paper title
Negative Confidence-Aware Weakly Supervised Binary Classification for Effective Review Helpfulness Classification
Paper authors
Paper abstract
Incomplete positive labels and large numbers of unlabelled instances are common problems in binary classification applications such as review helpfulness classification. Various studies in the classification literature treat all unlabelled instances as negative examples. However, a model that learns from incomplete positive labels while assuming that all unlabelled data are negative will often yield a biased classifier. In this work, we propose a novel Negative Confidence-aware Weakly Supervised approach (NCWS), which customises a binary classification loss function by distinguishing between unlabelled examples according to their negative confidences during the classifier's training. We use review helpfulness classification as a test case for examining the effectiveness of NCWS. We thoroughly evaluate NCWS on three datasets: one from Yelp (venue reviews) and two from Amazon (Kindle and Electronics reviews). Our results show that NCWS outperforms strong baselines from the literature, including an existing SVM-based approach (SVM-P), a positive and unlabelled learning-based approach (C-PU) and a positive confidence-based approach (P-conf), in addressing the classifier's bias problem. Moreover, we further examine the effectiveness of NCWS by using the reviews it classifies as helpful in a state-of-the-art review-based venue recommendation model (DeepCoNN), and demonstrate that NCWS enhances venue recommendation effectiveness in comparison to the baselines.