Paper Title

TED: Towards Discovering Top-k Edge-Diversified Patterns in a Graph Database

Authors

Kai Huang, Haibo Hu, Qingqing Ye, Kai Tian, Bolong Zheng, Xiaofang Zhou

Abstract

With an exponentially growing number of graphs from disparate repositories, there is a strong need to analyze graph databases containing extensive collections of small- or medium-sized data graphs (e.g., chemical compounds). Although subgraph enumeration and subgraph mining have been proposed to bring insights into a graph database through a set of subgraph structures, they often end up with similar or homogeneous topologies, which is undesirable in many graph applications. To address this limitation, we propose the Top-k Edge-Diversified Patterns Discovery problem, which retrieves a set of subgraphs that cover the maximum number of edges in a database. To process such queries efficiently, we present a generic and extensible framework called TED, which achieves a guaranteed approximation ratio to the optimal result. Two optimization strategies are further developed to improve performance. Experimental studies on real-world datasets demonstrate the superiority of TED over traditional techniques.
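To make the coverage objective concrete, here is a minimal sketch that treats the problem as plain maximum k-coverage: each candidate pattern is represented by the set of database edges it covers, and a standard greedy loop picks the pattern with the largest marginal gain k times. This is not the TED framework itself; the abstract does not describe how TED enumerates candidates or what its approximation ratio is. The function name `greedy_top_k_coverage`, the `(graph_id, u, v)` edge encoding, and the toy pattern names are all assumptions made for illustration, and the coverage sets are assumed to be precomputed.

```python
from typing import Dict, Hashable, List, Set, Tuple

# Hypothetical edge encoding: (graph id, endpoint u, endpoint v).
Edge = Tuple[int, Hashable, Hashable]


def greedy_top_k_coverage(
    coverage: Dict[str, Set[Edge]], k: int
) -> Tuple[List[str], Set[Edge]]:
    """Pick up to k patterns, each time taking the one that covers
    the most edges not yet covered (standard greedy max k-coverage)."""
    chosen: List[str] = []
    covered: Set[Edge] = set()
    remaining = dict(coverage)  # shallow copy so the input is not mutated

    for _ in range(k):
        best_name = None
        best_gain = 0
        # Iterate in sorted order so ties are broken deterministically.
        for name in sorted(remaining):
            gain = len(remaining[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no remaining pattern adds any new edge
        chosen.append(best_name)
        covered |= remaining.pop(best_name)

    return chosen, covered


# Toy coverage sets (assumed precomputed; names are illustrative only).
coverage = {
    "triangle": {(0, "a", "b"), (0, "b", "c"), (0, "a", "c"),
                 (1, "x", "y"), (1, "y", "z"), (1, "x", "z")},
    # Covers only edges that "triangle" already covers.
    "path3": {(0, "a", "b"), (0, "b", "c"), (1, "x", "y"), (1, "y", "z")},
    "star": {(2, "p", "q"), (2, "p", "r"), (2, "p", "s")},
}

chosen, covered = greedy_top_k_coverage(coverage, k=2)
print(chosen, len(covered))
```

In this toy input, `path3` covers more edges than `star` in isolation, but all of them overlap with `triangle`, so a coverage-driven selection prefers `star` as the second pattern. That is the kind of redundancy the abstract attributes to conventional subgraph enumeration and mining.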
