Explanatory causal effects for model agnostic explanations

Paper title

Explanatory causal effects for model agnostic explanations

Authors

Li, Jiuyong; Tran, Ha Xuan; Le, Thuc Duy; Liu, Lin; Yu, Kui; Liu, Jixue

Abstract

This paper studies the problem of estimating the contribution of each feature to a machine learning model's prediction for a specific instance, as well as the overall contribution of a feature to the model. The causal effect of a feature (variable) on the predicted outcome reflects the contribution of the feature to a prediction very well. A challenge is that most existing causal effects cannot be estimated from data without a known causal graph. In this paper, we define an explanatory causal effect based on a hypothetical ideal experiment. The definition brings several benefits to model-agnostic explanations. First, explanations are transparent and have causal meanings. Second, the explanatory causal effect estimation can be data-driven. Third, the causal effects provide both a local explanation for a specific prediction and a global explanation showing the overall importance of a feature in a predictive model. We further propose a method for explanation that uses individual and combined variables based on explanatory causal effects. We show that the definition and the method work through experiments on some real-world data sets.
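
The abstract does not state the formal definition of the explanatory causal effect, so the following is only a rough illustration of the local versus global distinction it describes, not the authors' method. It assumes binary features and a hypothetical black-box function `toy_model`, and it uses a simple contrast: set one feature to 1 and to 0 while holding the others fixed, then compare the model's outputs. Note that this sketch queries the model directly, whereas the abstract describes an estimate that is data-driven; all names here are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Instance = Sequence[int]  # binary feature vector (an assumption of this sketch)


def local_effect(predict: Callable[[Instance], float], x: Instance, j: int) -> float:
    """Change in the model output when feature j is set to 1 versus 0,
    with every other feature of x held fixed."""
    x_on = list(x)
    x_on[j] = 1
    x_off = list(x)
    x_off[j] = 0
    return predict(x_on) - predict(x_off)


def global_effect(predict: Callable[[Instance], float], data: List[Instance], j: int) -> float:
    """Average of the local effects of feature j over a data set."""
    return sum(local_effect(predict, x, j) for x in data) / len(data)


def toy_model(x: Instance) -> float:
    # Hypothetical black box: feature 0 always matters,
    # feature 1 matters only when feature 2 is also on.
    return 0.5 * x[0] + 0.25 * x[1] * x[2]


data = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)]

local_f1 = local_effect(toy_model, (1, 0, 1), 1)   # explanation for one instance
global_f0 = global_effect(toy_model, data, 0)      # overall importance of feature 0
global_f1 = global_effect(toy_model, data, 1)      # overall importance of feature 1
```

In this toy setting the local score for feature 1 depends on whether feature 2 is on in the instance being explained, which hints at why the abstract mentions combined variables alongside individual ones.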
