Paper Title
On the User Behavior Leakage from Recommender System Exposure
Paper Authors
Paper Abstract
Modern recommender systems are trained to predict users' potential future interactions from their historical behavior data. During the interaction process, besides the data coming from the user side, recommender systems also generate exposure data in order to provide users with personalized recommendation slates. Compared with the sparse user behavior data, the system exposure data is much larger in volume, since only very few exposed items are actually clicked by the user. Moreover, users' historical behavior data is privacy-sensitive and is commonly protected with careful access authorization. However, the large volume of recommender exposure data usually receives less attention and can be accessed by a relatively larger scope of information seekers. In this paper, we investigate the problem of user behavior leakage in recommender systems. We show that privacy-sensitive historical user behavior data can be inferred through the modeling of system exposure; that is, one can infer which items a user has clicked just from observing the current system exposure for that user. Given that system exposure data can be widely accessed within a relatively larger scope, we believe that users' past behavior privacy has a high risk of leakage in recommender systems. More precisely, we construct an attack model whose input is the current recommended item slate (i.e., the system exposure) for a user and whose output is the user's historical behavior. Experimental results on two real-world datasets indicate a great danger of user behavior leakage. To mitigate the risk, we propose a two-stage privacy-protection mechanism, which first selects a subset of items from the exposure slate and then replaces the selected items with uniform or popularity-based exposure. Experimental evaluation reveals a trade-off between recommendation accuracy and privacy disclosure risk.
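To make the attack setting concrete, below is a minimal sketch of an exposure-to-behavior attack model: it encodes the exposed item slate with a GRU and scores every catalog item as a candidate past click. The architecture, dimensions, and all names (ExposureAttacker, NUM_ITEMS, SLATE_LEN) are illustrative assumptions for this sketch, not the paper's exact model.

```python
# Hypothetical sketch: infer a user's past clicks from the exposure slate.
import torch
import torch.nn as nn

NUM_ITEMS = 10_000   # assumed catalog size
EMB_DIM = 64
SLATE_LEN = 20       # assumed exposure slate length

class ExposureAttacker(nn.Module):
    def __init__(self, num_items: int, emb_dim: int):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, emb_dim)
        self.encoder = nn.GRU(emb_dim, emb_dim, batch_first=True)
        self.out = nn.Linear(emb_dim, num_items)  # score every catalog item

    def forward(self, slate: torch.Tensor) -> torch.Tensor:
        # slate: (batch, slate_len) IDs of items exposed to the user
        h = self.item_emb(slate)
        _, last = self.encoder(h)                 # final hidden state
        return self.out(last.squeeze(0))          # (batch, num_items) logits

model = ExposureAttacker(NUM_ITEMS, EMB_DIM)
slate = torch.randint(0, NUM_ITEMS, (2, SLATE_LEN))   # two observed slates
logits = model(slate)
inferred_history = logits.topk(k=5, dim=-1).indices   # top-5 guessed past clicks
print(inferred_history.shape)                         # torch.Size([2, 5])
```

In the paper's threat model, an attacker with access to exposure logs would train such a model on pairs of observed slates and known click histories; the forward pass above only illustrates the input/output interface (slate in, predicted historical behavior out).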
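The two-stage protection mechanism can likewise be sketched in a few lines: stage one selects a subset of slate positions, and stage two replaces the items at those positions with either uniformly sampled or popularity-weighted replacements. The function name, the replacement ratio, and the popularity weights are assumptions made for illustration.

```python
# Hypothetical sketch of the two-stage select-and-replace mechanism.
import random

def protect_slate(slate, catalog, popularity=None, ratio=0.3):
    """Return a perturbed copy of an exposure slate.

    slate      : list of exposed item IDs
    catalog    : list of all item IDs
    popularity : optional dict item -> weight; if given, replacements are
                 drawn popularity-based, otherwise uniformly at random
    ratio      : fraction of slate positions to replace (assumed value)
    """
    protected = list(slate)
    k = max(1, int(len(slate) * ratio))
    positions = random.sample(range(len(slate)), k)   # stage 1: select subset
    for pos in positions:                             # stage 2: replace items
        if popularity:
            items = list(popularity)
            weights = [popularity[i] for i in items]
            protected[pos] = random.choices(items, weights=weights, k=1)[0]
        else:
            protected[pos] = random.choice(catalog)
    return protected

catalog = list(range(100))
slate = [3, 17, 42, 8, 99]
print(protect_slate(slate, catalog))                  # uniform replacement
pop = {i: 100 - i for i in catalog}                   # assumed popularity counts
print(protect_slate(slate, catalog, popularity=pop))  # popularity-based replacement
```

The ratio parameter makes the trade-off reported in the abstract tangible: replacing more positions degrades what an attacker can infer from the slate, but also moves the slate further from the recommender's true ranking, lowering recommendation accuracy.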