Causal World Models by Unsupervised Deconfounding of Physical Dynamics

Paper Title

Causal World Models by Unsupervised Deconfounding of Physical Dynamics

Authors

Minne Li, Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jun Wang

Abstract

The capability of imagining internally with a mental model of the world is vitally important for human cognition. If a machine intelligent agent can learn a world model to create a "dream" environment, it can then internally ask what-if questions -- simulate the alternative futures that haven't been experienced in the past yet -- and make optimal decisions accordingly. Existing world models are established typically by learning spatio-temporal regularities embedded from the past sensory signal without taking into account confounding factors that influence state transition dynamics. As such, they fail to answer the critical counterfactual questions about "what would have happened" if a certain action policy was taken. In this paper, we propose Causal World Models (CWMs) that allow unsupervised modeling of relationships between the intervened observations and the alternative futures by learning an estimator of the latent confounding factors. We empirically evaluate our method and demonstrate its effectiveness in a variety of physical reasoning environments. Specifically, we show reductions in sample complexity for reinforcement learning tasks and improvements in counterfactual physical reasoning.

Paper Title