论文标题
通过整数线性编程解决贝叶斯网络结构学习问题
Solving Bayesian Network Structure Learning Problem with Integer Linear Programming
论文作者
论文摘要
本文研究了贝叶斯网络结构学习问题的整数线性编程(ILP)公式。我们回顾贝叶斯网络的定义和关键属性,并解释用于衡量某些贝叶斯网络结构对数据集的程度的分数指标。我们根据分数指标的可分解性概述了整数线性编程公式。为了确保结构的循环性,我们添加了专门针对贝叶斯网络开发的``集群约束'',除了适用于定向无环形图的循环约束外。由于这些约束将有指数级的数量,如果我们完全指定它们,我们将解释将它们添加为切割平面而无需在初始模型中声明它们的方法。另外,我们开发了一种启发式算法,该算法基于有向无环图的下沉节点的想法找到可行的解决方案。我们将ILP公式和切割平面实现为\ textsf {python}软件包,并在参考数据集上使用不同的设置进行了实验结果。
This dissertation investigates integer linear programming (ILP) formulation of Bayesian Network structure learning problem. We review the definition and key properties of Bayesian network and explain score metrics used to measure how well certain Bayesian network structure fits the dataset. We outline the integer linear programming formulation based on the decomposability of score metrics. In order to ensure acyclicity of the structure, we add ``cluster constraints'' developed specifically for Bayesian network, in addition to cycle constraints applicable to directed acyclic graphs in general. Since there would be exponential number of these constraints if we specify them fully, we explain the methods to add them as cutting planes without declaring them all in the initial model. Also, we develop a heuristic algorithm that finds a feasible solution based on the idea of sink node on directed acyclic graphs. We implemented the ILP formulation and cutting planes as a \textsf{Python} package, and present the results of experiments with different settings on reference datasets.