论文标题
壁画:简单复合物中的频繁模式
FreSCo: Mining Frequent Patterns in Simplicial Complexes
论文作者
论文摘要
简单的复合物是对高阶关系建模图的概括。在本文中,我们介绍了简单的模式(我们称为简单的模式),并将频繁挖掘的任务从图形的范围推广到Simplicial Complextes。由于巨大的搜索空间以及对高阶同构的需求,我们的任务尤其具有挑战性。我们表明,在线性时间和最多二次空间中,找到复合物中的简单出现可以简化为二分图同构问题。然后,我们提出了一种反主体频率度量,该测量使我们能够从小型简单群开始探索,并在其频率降至最小频率阈值以下后立即停止扩展。配备了这些想法和巧妙的数据结构,我们开发了一种具有内存意识的算法,通过仔细利用复杂和简单中简单之间的关系,可以为我们的复杂采矿任务实现效率和可扩展性。我们的算法壁画有两种口味:它可以计算简单的精确频率,或者更快地,它可以确定Simplet是否频繁,而无需计算精确的频率。实验结果证明了壁画能够在各种大小和维度的复合物中开采频繁的简单,以及简单对传统图形模式的重要性。
Simplicial complexes are a generalization of graphs that model higher-order relations. In this paper, we introduce simplicial patterns -- that we call simplets -- and generalize the task of frequent pattern mining from the realm of graphs to that of simplicial complexes. Our task is particularly challenging due to the enormous search space and the need for higher-order isomorphism. We show that finding the occurrences of simplets in a complex can be reduced to a bipartite graph isomorphism problem, in linear time and at most quadratic space. We then propose an anti-monotonic frequency measure that allows us to start the exploration from small simplets and stop expanding a simplet as soon as its frequency falls below the minimum frequency threshold. Equipped with these ideas and a clever data structure, we develop a memory-conscious algorithm that, by carefully exploiting the relationships among the simplices in the complex and among the simplets, achieves efficiency and scalability for our complex mining task. Our algorithm, FreSCo, comes in two flavors: it can compute the exact frequency of the simplets or, more quickly, it can determine whether a simplet is frequent, without having to compute the exact frequency. Experimental results prove the ability of FreSCo to mine frequent simplets in complexes of various size and dimension, and the significance of the simplets with respect to the traditional graph patterns.