Paper Title
TripleTree: A Versatile Interpretable Representation of Black Box Agents and their Environments
Paper Authors
Paper Abstract
In explainable artificial intelligence, there is increasing interest in understanding the behaviour of autonomous agents to build trust and validate performance. Modern agent architectures, such as those trained by deep reinforcement learning, are currently so lacking in interpretable structure as to effectively be black boxes, but insights may still be gained from an external, behaviourist perspective. Inspired by conceptual spaces theory, we suggest that a versatile first step towards general understanding is to discretise the state space into convex regions, jointly capturing similarities over the agent's action, value function and temporal dynamics within a dataset of observations. We create such a representation using a novel variant of the CART decision tree algorithm, and demonstrate how it facilitates practical understanding of black box agents through prediction, visualisation and rule-based explanation.
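The abstract describes discretising the state space with a CART-style tree whose split criterion jointly considers the agent's action, value estimates and temporal dynamics. The snippet below is a minimal, hypothetical sketch of that idea, not the paper's TripleTree implementation: the impurity weights, the use of Gini impurity for actions, variance for values and one-step state changes, and the greedy threshold search are all illustrative assumptions.

```python
# Hypothetical sketch of a CART-like split search with a joint impurity over
# actions, values and one-step state changes (weights and criteria are assumptions).
import numpy as np

def gini(actions):
    """Gini impurity of a discrete action array."""
    if len(actions) == 0:
        return 0.0
    _, counts = np.unique(actions, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def variance(x):
    """Mean per-dimension variance; 0 for empty arrays."""
    return float(np.mean(np.var(x, axis=0))) if len(x) else 0.0

def joint_impurity(actions, values, deltas, w=(1.0, 1.0, 1.0)):
    """Weighted sum of action, value and dynamics impurities (weights are assumed)."""
    return w[0] * gini(actions) + w[1] * variance(values[:, None]) + w[2] * variance(deltas)

def best_split(states, actions, values, deltas):
    """Greedy search over axis-aligned thresholds minimising size-weighted joint impurity."""
    n, d = states.shape
    best = (None, None, joint_impurity(actions, values, deltas))
    for feat in range(d):
        for thresh in np.unique(states[:, feat])[1:]:
            left = states[:, feat] < thresh
            right = ~left
            score = (left.sum() * joint_impurity(actions[left], values[left], deltas[left]) +
                     right.sum() * joint_impurity(actions[right], values[right], deltas[right])) / n
            if score < best[2]:
                best = (feat, thresh, score)
    return best

# Toy usage: 200 random 2-D states whose action, value and dynamics all change at x0 = 0.5.
rng = np.random.default_rng(0)
S = rng.uniform(size=(200, 2))                               # observed states
A = (S[:, 0] > 0.5).astype(int)                              # agent's action in each state
V = np.where(S[:, 0] > 0.5, 1.0, 0.0)                        # value estimates
D = np.where(S[:, 0:1] > 0.5, 0.1, -0.1) * np.ones_like(S)   # one-step state changes
print(best_split(S, A, V, D))                                # expect a split on feature 0 near 0.5
```

Applied recursively, such axis-aligned splits partition the state space into hyperrectangular (hence convex) regions, which is the property the abstract emphasises for interpretability.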