Paper Title
Learning Causal Overhypotheses through Exploration in Children and Computational Models
Paper Authors
Paper Abstract
Despite recent progress in reinforcement learning (RL), RL algorithms for exploration remain an active area of research. Existing methods often focus on state-based metrics, which do not consider the underlying causal structure of the environment, and while recent research has begun to explore RL environments for causal learning, these environments primarily leverage causal information through causal inference or induction rather than exploration. In contrast, human children, some of the most proficient explorers, have been shown to use causal information to great benefit. In this work, we introduce a novel RL environment designed with a controllable causal structure, which allows us to evaluate the exploration strategies used by both agents and children in a unified setting. In addition, through experiments with both computational models and children, we demonstrate that there are significant differences between information-gain optimal RL exploration in causal environments and the exploration of children in the same environments. We conclude with a discussion of how these findings may inspire new directions of research into efficient exploration and disambiguation of causal structures for RL algorithms.
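To make the ideas of a controllable causal structure and information-gain optimal exploration concrete, below is a minimal sketch of a blicket-detector-style setup. This is not the paper's actual environment or API; the names (`CausalEnv`, `expected_info_gain`, the "conjunctive"/"disjunctive" rule labels) and the greedy explorer are illustrative assumptions. The hidden causal overhypothesis is a pair (which blocks are "blickets", which rule governs the detector), and the agent picks the intervention that maximizes expected reduction in posterior entropy.

```python
import itertools
import math

class CausalEnv:
    """Toy detector that lights up according to a hidden rule over 'blicket' blocks.
    Hypothetical stand-in for an environment with controllable causal structure."""
    def __init__(self, num_blocks=3, blickets=(0, 1), rule="conjunctive"):
        self.num_blocks = num_blocks
        self.blickets = frozenset(blickets)   # hidden: which blocks are blickets
        self.rule = rule                      # hidden: the causal overhypothesis

    def step(self, placed):
        """placed: set of block indices put on the detector; returns detector on/off."""
        hits = len(self.blickets & set(placed))
        if self.rule == "conjunctive":        # all blickets must be present
            return hits == len(self.blickets)
        return hits > 0                       # disjunctive: any blicket suffices

def hypotheses(num_blocks=3):
    """Enumerate (blicket-set, rule) pairs: the joint hypothesis space."""
    for k in range(1, num_blocks + 1):
        for subset in itertools.combinations(range(num_blocks), k):
            for rule in ("conjunctive", "disjunctive"):
                yield (frozenset(subset), rule)

def predict(hyp, placed):
    blickets, rule = hyp
    hits = len(blickets & set(placed))
    return hits == len(blickets) if rule == "conjunctive" else hits > 0

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_info_gain(posterior, placed):
    """Expected drop in posterior entropy from one intervention (outcomes are deterministic)."""
    h_before = entropy(list(posterior.values()))
    p_on = sum(p for hyp, p in posterior.items() if predict(hyp, placed))
    gain = 0.0
    for outcome, p_out in ((True, p_on), (False, 1 - p_on)):
        if p_out == 0:
            continue
        cond = [p / p_out for hyp, p in posterior.items()
                if predict(hyp, placed) == outcome]
        gain += p_out * (h_before - entropy(cond))
    return gain

# Greedy information-gain explorer: repeatedly pick the intervention with
# the highest expected gain, then do an exact Bayesian update.
env = CausalEnv()
hyps = list(hypotheses())
posterior = {h: 1.0 / len(hyps) for h in hyps}
actions = [set(c) for k in range(1, 4)
           for c in itertools.combinations(range(3), k)]

for _ in range(4):
    placed = max(actions, key=lambda a: expected_info_gain(posterior, a))
    outcome = env.step(placed)
    for h in list(posterior):                 # zero out inconsistent hypotheses
        if predict(h, placed) != outcome:
            posterior[h] = 0.0
    z = sum(posterior.values())
    posterior = {h: p / z for h, p in posterior.items()}

best = max(posterior, key=posterior.get)
print("MAP hypothesis:", sorted(best[0]), best[1])
```

Under this sketch, a few greedy interventions suffice to separate the conjunctive and disjunctive overhypotheses; the paper's comparison concerns how children's intervention choices diverge from such information-gain optimal sequences.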