Robust Spammer Detection by Nash Reinforcement Learning

Paper Title

Robust Spammer Detection by Nash Reinforcement Learning

Authors

Dou, Yingtong; Ma, Guixiang; Yu, Philip S.; Xie, Sihong

Abstract

Online reviews provide product evaluations that help customers make decisions. Unfortunately, professional spammers can manipulate these evaluations with fake reviews ("spams"), and they have learned increasingly insidious and powerful spamming strategies by adapting to the deployed detectors. Spamming strategies are hard to capture: they can vary quickly over time, differ across spammers and target products, and, more critically, remain unknown in most cases. Furthermore, most existing detectors focus on detection accuracy, which is not well aligned with the goal of maintaining the trustworthiness of product evaluations. To address these challenges, we formulate a minimax game in which the spammers and spam detectors compete with each other on their practical goals, which are not solely based on detection accuracy. Nash equilibria of the game lead to stable detectors that are agnostic to any mixed detection strategies. However, the game has no closed-form solution and is not differentiable, so it does not admit the typical gradient-based algorithms. We turn the game into two dependent Markov Decision Processes (MDPs) to allow efficient stochastic optimization based on multi-armed bandit and policy gradient. We experiment on three large review datasets using various state-of-the-art spamming and detection strategies and show that the optimization algorithm can reliably find an equilibrium detector that robustly and effectively prevents spammers with any mixed spamming strategies from attaining their practical goal. Our code is available at https://github.com/YingtongDou/Nash-Detect.
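The minimax idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's Nash-Detect algorithm (which works on MDPs with a multi-armed bandit and policy gradient); it swaps in a much simpler technique, the Hedge (multiplicative weights) update against a best-responding attacker, on a small zero-sum matrix game. The function names and the `gain` matrix are hypothetical: `gain[i][j]` stands for the practical effect a spammer achieves with attack `j` when detector `i` is deployed, and the numbers are made up for illustration.

```python
import math


def equilibrium_detector_mix(gain, rounds=5000):
    """Approximate a minimax mixture over detectors with Hedge.

    gain[i][j] is a hypothetical spammer gain in [0, 1] for detector i
    versus attack j. The defender minimizes it, the spammer maximizes it.
    """
    n_det, n_att = len(gain), len(gain[0])
    # Standard Hedge learning rate for losses bounded in [0, 1].
    eta = math.sqrt(8.0 * math.log(n_det) / rounds)
    cum_loss = [0.0] * n_det
    avg_mix = [0.0] * n_det
    for _ in range(rounds):
        # Exponential weights; subtract the minimum for numerical stability.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        mix = [w / total for w in weights]
        # The spammer best-responds to the current detector mixture.
        expected = [sum(mix[i] * gain[i][j] for i in range(n_det))
                    for j in range(n_att)]
        attack = max(range(n_att), key=lambda j: expected[j])
        for i in range(n_det):
            cum_loss[i] += gain[i][attack]
            avg_mix[i] += mix[i] / rounds
    return avg_mix


def worst_case_gain(gain, mix):
    """Largest expected spammer gain over all pure attacks against `mix`."""
    n_det, n_att = len(gain), len(gain[0])
    return max(sum(mix[i] * gain[i][j] for i in range(n_det))
               for j in range(n_att))


if __name__ == "__main__":
    toy_gain = [[0.8, 0.2],   # detector 0: weak against attack 0
                [0.3, 0.7]]   # detector 1: weak against attack 1
    mix = equilibrium_detector_mix(toy_gain)
    print("detector mixture:", [round(p, 3) for p in mix])
    print("worst-case spammer gain:", round(worst_case_gain(toy_gain, mix), 3))
```

In this toy game either pure detector lets the spammer reach a gain of 0.7 or 0.8, while the averaged mixture holds the worst case close to the game value of 0.5, whichever attack (or mix of attacks) the spammer picks. That is the sense in which an equilibrium detector is robust to unknown mixed strategies.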
