Paper Title

Are Your Reviewers Being Treated Equally? Discovering Subgroup Structures to Improve Fairness in Spam Detection

Authors

Liu, Jiaxin; Lyu, Yuefei; Zhang, Xi; Xie, Sihong

Abstract

User-generated product reviews are vital assets of online commerce platforms such as Amazon and Yelp, yet fake reviews that mislead customers are prevalent. Graph neural networks (GNNs) are the state-of-the-art approach, detecting suspicious reviewers by exploiting the topology of the graph connecting reviewers, reviews, and target products. However, discrepancies in detection accuracy across different groups of reviewers can degrade reviewer engagement and customer trust in review websites. Unlike the previous belief that differences between groups cause unfairness, we study subgroup structures within the groups that can also cause discrepancies in how different groups are treated. This paper addresses the challenges of defining, approximating, and utilizing a new subgroup structure for fair spam detection. We first identify subgroup structures in the review graph that lead to discrepant accuracy in the groups. The complex dependencies over the review graph make it difficult to tease out subgroups hidden within larger groups. We design a model that can be trained to jointly infer the hidden subgroup memberships and exploit those memberships to calibrate the detection accuracy across groups. Comprehensive comparisons against baselines on three large Yelp review datasets demonstrate that subgroup membership can be identified and exploited for group fairness.
