论文标题

使用Fisher的不等式将加权图的分解更快地分解为集团

Faster Decomposition of Weighted Graphs into Cliques using Fisher's Inequality

论文作者

Jain, Shweta, Mizutani, Yosuke, Sullivan, Blair

论文摘要

在生物医学研究中,始终如一地表达的基因的采矿基因是一个重要的问题,在诸如抑制药物和设计新疾病治疗之类的应用中至关重要。最近,Cooley等。将这个问题建模为精确的加权集团分解(EWCD),其中,给定具有边缘加权的图形$ g $和一个正整数$ k $,目标是将$ g $分解为最多$ k $(重叠的)加权集团,以使边缘的权重完全等于cliques for cliques in comparate in niver in-ewcore in-ewcore in-ewcram in-ewccd。 $ 4^K $ -KERNEL与回溯算法(称为Cricca)一起迭代构建分解。不幸的是,由于潜在解决方案空间中固有的指数增长,Cricca通常只能在$ k \ leq 11 $时才能分解图。 在这项工作中,我们建立了减少规则,该规则将EWCD的内核大小(从$ 4^k $降至$ k2^k $)。此外,我们还使用有关潜在解决方案结构的见解来提供新的搜索规则,以加快分解算法的速度。我们技术的核心是组合设计理论的结果,称为Fisher的不平等,表征了与限制交叉点的设定系统。我们在生物学启发的数据的语料库上部署了内核化和分解算法(一起称为DECAF),并在克里卡卡(Cricca)上获得了超过两个数量级的速度。结果,DePaf缩放为$ k \ geq 17 $的实例。

Mining groups of genes that consistently co-express is an important problem in biomedical research, where it is critical for applications such as drug-repositioning and designing new disease treatments. Recently, Cooley et al. modeled this problem as Exact Weighted Clique Decomposition (EWCD) in which, given an edge-weighted graph $G$ and a positive integer $k$, the goal is to decompose $G$ into at most $k$ (overlapping) weighted cliques so that an edge's weight is exactly equal to the sum of weights for cliques it participates in. They show EWCD is fixed-parameter-tractable, giving a $4^k$-kernel alongside a backtracking algorithm (together called cricca) to iteratively build a decomposition. Unfortunately, because of inherent exponential growth in the space of potential solutions, cricca is typically able to decompose graphs only when $k \leq 11$. In this work, we establish reduction rules that exponentially decrease the size of the kernel (from $4^k$ to $k2^k$) for EWCD. In addition, we use insights about the structure of potential solutions to give new search rules that speed up the decomposition algorithm. At the core of our techniques is a result from combinatorial design theory called Fisher's inequality characterizing set systems with restricted intersections. We deploy our kernelization and decomposition algorithms (together called DeCAF) on a corpus of biologically-inspired data and obtain over two orders of magnitude speed-up over cricca. As a result, DeCAF scales to instances with $k \geq 17$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源