Paper title
Bayes Point Rule Set Learning
Paper authors
Abstract
Interpretability plays an increasingly important role in the design of machine learning algorithms. However, interpretable methods tend to be less accurate than their black-box counterparts. Among others, DNFs (Disjunctive Normal Forms) are arguably the most interpretable way to express a set of rules. In this paper, we propose an effective bottom-up extension of the popular FIND-S algorithm to learn DNF-type rule sets. The algorithm greedily finds a partition of the positive examples. The resulting DNF is a set of conjunctive rules, each of which is the most specific rule consistent with one part of the positive examples and with all negative examples. We also propose two principled extensions of this method that approximate the Bayes Optimal Classifier by aggregating DNF decision rules. Finally, we provide a methodology to significantly improve the explainability of the learned rules while retaining their generalization capabilities. An extensive comparison with state-of-the-art symbolic and statistical methods on several benchmark data sets shows that our proposal provides an excellent balance between explainability and accuracy.
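To make the idea of a greedy, bottom-up FIND-S extension more concrete, the following is a minimal sketch and not the authors' implementation. It assumes categorical attributes encoded as tuples of strings, uses `None` as a wildcard, and assumes that no positive example is identical to a negative one. The names `generalize`, `covers`, `learn_dnf`, and `predict` are hypothetical, and the merge order (first rule that stays consistent) is an assumption, since the abstract does not specify it.

```python
from typing import List, Optional, Sequence, Tuple

Example = Tuple[str, ...]
Rule = Tuple[Optional[str], ...]  # None acts as a wildcard ("any value")


def generalize(rule: Rule, example: Example) -> Rule:
    """FIND-S step: keep attribute values that agree, wildcard the rest."""
    return tuple(r if r == e else None for r, e in zip(rule, example))


def covers(rule: Rule, example: Example) -> bool:
    """A conjunctive rule covers an example if every non-wildcard matches."""
    return all(r is None or r == e for r, e in zip(rule, example))


def learn_dnf(positives: Sequence[Example],
              negatives: Sequence[Example]) -> List[Rule]:
    """Greedily partition the positives; each part yields one conjunction.

    Assumption: a positive joins the first existing rule whose minimal
    generalization still excludes every negative; otherwise it starts a
    new, maximally specific rule of its own.
    """
    rules: List[Rule] = []
    for pos in positives:
        for i, rule in enumerate(rules):
            candidate = generalize(rule, pos)
            if not any(covers(candidate, neg) for neg in negatives):
                rules[i] = candidate
                break
        else:
            rules.append(tuple(pos))
    return rules


def predict(rules: Sequence[Rule], example: Example) -> bool:
    """The DNF fires if at least one of its conjunctive rules fires."""
    return any(covers(rule, example) for rule in rules)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    positives = [("sunny", "warm"), ("sunny", "cold"), ("rainy", "warm")]
    negatives = [("rainy", "cold")]
    print(learn_dnf(positives, negatives))
```

In this sketch each rule is, by construction, the most specific conjunction covering its part of the positives, and a merge is accepted only if the result remains consistent with all negatives. The Bayes-point aggregation and rule-simplification steps mentioned in the abstract are not covered here.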