Paper Title
HEX: Human-in-the-loop Explainability via Deep Reinforcement Learning
Paper Authors
Paper Abstract
The use of machine learning (ML) models in decision-making contexts, particularly in high-stakes decision-making, is fraught with issues and peril, since a person, not a machine, must ultimately be held accountable for the consequences of decisions made using such systems. Machine learning explainability (MLX) promises to provide decision-makers with prediction-specific rationale, assuring them that model-elicited predictions are made for the right reasons and are thus reliable. Few works, however, explicitly consider this key human-in-the-loop (HITL) component. In this work we propose HEX, a human-in-the-loop deep reinforcement learning approach to MLX. HEX incorporates 0-distrust projection to synthesize decider-specific explanation-providing policies from any arbitrary classification model. HEX is also constructed to operate in limited or reduced training data scenarios, such as those employing federated learning. Our formulation explicitly considers the decision boundary of the ML model in question rather than the underlying training data, reliance on the latter being a shortcoming of many model-agnostic MLX methods. Our proposed method thus synthesizes HITL MLX policies that explicitly capture the decision boundary of the model in question, for use in limited data scenarios.
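The abstract frames explanation synthesis as a policy that probes the classifier's decision boundary rather than its training data. The sketch below is not HEX or its 0-distrust projection; it is a minimal, hypothetical illustration (the `ExplanationEnv` class name and the reward shaping are assumptions made here for illustration) of how per-instance explanation can be cast as a sequential decision problem over a black-box classifier's predictions.

```python
# Hypothetical sketch, NOT the paper's algorithm: explanation as a sequential
# decision problem over a black-box classifier. "Actions" toggle feature masks
# on one instance; the reward measures how far the masked instance moves the
# model's predicted probability, i.e., it probes the decision boundary directly
# instead of inspecting the training data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


class ExplanationEnv:
    """Toy environment wrapping a black-box model for one instance."""

    def __init__(self, model, instance, baseline):
        self.model = model          # black box: only predict_proba is used
        self.x = instance           # instance to explain
        self.baseline = baseline    # reference values for "removed" features
        self.mask = np.ones_like(instance, dtype=bool)
        self.p0 = model.predict_proba(instance.reshape(1, -1))[0, 1]

    def step(self, feature_idx):
        """Toggle one feature; reward = shift in the model's prediction."""
        self.mask[feature_idx] = ~self.mask[feature_idx]
        x_masked = np.where(self.mask, self.x, self.baseline)
        p = self.model.predict_proba(x_masked.reshape(1, -1))[0, 1]
        reward = abs(p - self.p0)
        return self.mask.copy(), reward


# Usage: a greedy one-step stand-in for a learned RL policy, scoring each
# feature by how much removing it perturbs the prediction.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
rewards = []
for i in range(X.shape[1]):
    env = ExplanationEnv(model, X[0], baseline=X.mean(axis=0))
    _, r = env.step(i)
    rewards.append(r)
print("Most influential feature (toy estimate):", int(np.argmax(rewards)))
```

In the paper's setting a learned policy would replace this greedy scan, and decider-specific preferences would shape the policy; the sketch only shows why querying the model's decision boundary, rather than the training data, suffices to attribute a prediction.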