Paper title
Data Leakage via Access Patterns of Sparse Features in Deep Learning-based Recommendation Systems
Paper authors
Paper abstract
Online personalized recommendation services are generally hosted in the cloud, where users query a cloud-based model to receive recommendations such as merchandise of interest or news feeds. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, little attention has been paid to potential information leakage through them. These sparse features are employed to track users' behavior, e.g., their click history and object interactions, and may therefore carry each user's private information. Sparse features are represented as learned embedding vectors stored in large tables, and personalized recommendation is performed by using a specific user's sparse features to index into the tables. Even with recently proposed methods that hide the computation happening in the cloud, an attacker in the cloud may still be able to track the access patterns to the embedding tables. This paper explores the private information that may be learned by tracking a recommendation model's sparse feature access patterns. We first characterize the types of attacks that can be carried out on sparse features in recommendation models in an untrusted cloud, and then demonstrate how each of these attacks leads to extracting users' private information or tracking users by their behavior over time.
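
The mechanism described in the abstract, where a user's sparse features index into an embedding table, can be illustrated with a toy sketch. This is not the paper's attack or code. The class `EmbeddingTable`, the function `link_queries`, the table size, and the click lists are all hypothetical, and the sketch assumes the observer sees exactly which row indices each query touches, even though it cannot see the values or the computation.

```python
import random


class EmbeddingTable:
    """Toy embedding table (hypothetical) that records which rows each query reads."""

    def __init__(self, num_rows, dim, seed=0):
        rng = random.Random(seed)
        self.rows = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                     for _ in range(num_rows)]
        # Assumption: this is what an observer of memory accesses can see.
        self.access_log = []

    def lookup(self, indices):
        """Sum-pool the embedding rows selected by a user's sparse feature indices."""
        self.access_log.append(tuple(indices))
        pooled = [0.0] * len(self.rows[0])
        for i in indices:
            for d, value in enumerate(self.rows[i]):
                pooled[d] += value
        return pooled


def link_queries(access_log):
    """Group query positions that touched exactly the same set of rows."""
    groups = {}
    for query_id, pattern in enumerate(access_log):
        groups.setdefault(frozenset(pattern), []).append(query_id)
    return groups


table = EmbeddingTable(num_rows=1000, dim=4)
user_a_clicks = [12, 407, 993]   # made-up item IDs from user A's click history
user_b_clicks = [5, 88]          # made-up item IDs from user B's click history

table.lookup(user_a_clicks)      # query 0
table.lookup(user_b_clicks)      # query 1
table.lookup(user_a_clicks)      # query 2, same user returning later

linked = link_queries(table.access_log)
```

In this sketch the observer never sees an embedding value, yet `linked` groups queries 0 and 2 together because they touch the same rows. That is the kind of tracking by behavior over time that the abstract refers to, reduced here to an exact-match comparison for brevity.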