论文标题
LDP-FPMiner:基于FP-TREE的频繁项目集挖掘与局部差异隐私
LDP-FPMiner: FP-Tree Based Frequent Itemset Mining with Local Differential Privacy
论文作者
论文摘要
在局部差异隐私设置(LDP)设置中的数据聚合通过提供敏感数据的合理可否认性来确保强大的隐私。现有关于此问题的作品主要集中在发现重型击球手,而将频繁的项目集挖掘(FIM)的任务作为一个开放的问题。据我们所知,FIM的最新状态LDP解决方案是最近提出的SVSM协议。 SVSM协议主要基于基于填充和采样的频率Oracle(PSFO)协议,并将项目集视为独立项目,而无需考虑项目集之间的频率一致性。 在本文中,我们提出了一种基于频繁图案树(FP-Tree)的FIM的新型LDP方法,称为LDP-FPMiner。我们的提案通过使用LDP构建和优化嘈杂的FP-Tree来利用项目集之间的频率一致性。具体来说,它的工作如下。首先,确定了最常见的项目,并相应地切除项目域。其次,估计FP-树的最大水平。第三,通过使用项目集频率一致性构建和优化了嘈杂的FP-Tree,然后开采以获得k最常见的项目集。实验结果表明,LDP-FPMiner在最先进的方法SVSM上显着改善,尤其是在高隐私水平的情况下。
Data aggregation in the setting of local differential privacy (LDP) guarantees strong privacy by providing plausible deniability of sensitive data. Existing works on this issue mostly focused on discovering heavy hitters, leaving the task of frequent itemset mining (FIM) as an open problem. To the best of our knowledge, the-state-of-the-art LDP solution to FIM is the SVSM protocol proposed recently. The SVSM protocol is mainly based on the padding and sampling based frequency oracle (PSFO) protocol, and regarded an itemset as an independent item without considering the frequency consistency among itemsets. In this paper, we propose a novel LDP approach to FIM called LDP-FPMiner based on frequent pattern tree (FP-tree). Our proposal exploits frequency consistency among itemsets by constructing and optimizing a noisy FP-tree with LDP. Specifically, it works as follows. First, the most frequent items are identified, and the item domain is cut down accordingly. Second, the maximum level of the FP-tree is estimated. Third, a noisy FP-tree is constructed and optimized by using itemset frequency consistency, and then mined to obtain the k most frequent itemsets. Experimental results show that the LDP-FPMiner significantly improves over the state-of-the-art approach, SVSM, especially in the case of a high privacy level.