Paper Title

Marginal Effects for Non-Linear Prediction Functions

Authors

Scholbeck, Christian A., Casalicchio, Giuseppe, Molnar, Christoph, Bischl, Bernd, Heumann, Christian

Abstract

Beta coefficients for linear regression models represent the ideal form of an interpretable feature effect. However, for non-linear models and especially generalized linear models, the estimated coefficients cannot be interpreted as a direct feature effect on the predicted outcome. Hence, marginal effects are typically used as approximations for feature effects, either in the shape of derivatives of the prediction function or forward differences in prediction due to a change in a feature value. While marginal effects are commonly used in many scientific fields, they have not yet been adopted as a model-agnostic interpretation method for machine learning models. This may stem from their inflexibility as a univariate feature effect and their inability to deal with the non-linearities found in black box models. We introduce a new class of marginal effects termed forward marginal effects. We argue to abandon derivatives in favor of better-interpretable forward differences. Furthermore, we generalize marginal effects based on forward differences to multivariate changes in feature values. To account for the non-linearity of prediction functions, we introduce a non-linearity measure for marginal effects. We argue against summarizing feature effects of a non-linear prediction function in a single metric such as the average marginal effect. Instead, we propose to partition the feature space to compute conditional average marginal effects on feature subspaces, which serve as conditional feature effect estimates.
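
To make the abstract's central idea concrete, the following is a minimal sketch of a forward marginal effect: the difference f(x + h) - f(x) in prediction when selected feature values are shifted by a step h. It is not the authors' implementation. The function name `forward_marginal_effect`, the toy prediction function, the data, and the hand-picked split at x0 = 0 are all assumptions made for illustration; the abstract does not say how the feature space should be partitioned, and the non-linearity measure it mentions is not shown here.

```python
import numpy as np

def forward_marginal_effect(predict, X, features, steps):
    """Per-observation forward difference f(x + h) - f(x).

    `features` lists the column indices to shift and `steps` the matching
    step sizes, so a multivariate change is just several indices at once.
    (Hypothetical helper, not an API from the paper.)
    """
    X = np.asarray(X, dtype=float)
    X_shifted = X.copy()
    X_shifted[:, features] += np.asarray(steps, dtype=float)
    return predict(X_shifted) - predict(X)

def predict(X):
    # Toy non-linear prediction function (assumption): f(x) = x0^2 + x1
    return X[:, 0] ** 2 + X[:, 1]

# Small illustrative data set (assumption)
X = np.array([[-2.0, 0.0],
              [-1.0, 1.0],
              [ 1.0, 0.0],
              [ 2.0, 1.0]])

# Univariate change: increase x0 by 1
fme = forward_marginal_effect(predict, X, features=[0], steps=[1.0])

# Multivariate change: increase x0 by 1 and x1 by 0.5 simultaneously
fme_multi = forward_marginal_effect(predict, X, features=[0, 1], steps=[1.0, 0.5])

# A single average over all observations
ame = fme.mean()

# Conditional averages on two subspaces, split by hand at x0 = 0 (assumption)
left = X[:, 0] < 0
came_left = fme[left].mean()
came_right = fme[~left].mean()

print(fme, fme_multi, ame, came_left, came_right)
```

In this toy case the individual effects have opposite signs on either side of x0 = 0, so the single average is a poor summary of either region, which is the motivation the abstract gives for reporting conditional averages on feature subspaces instead.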
