Paper Title

Misspecification in Inverse Reinforcement Learning

Paper Authors

Joar Skalse, Alessandro Abate

Paper Abstract

The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function $R$ from a policy $π$. To do this, we need a model of how $π$ relates to $R$. In the current literature, the most common models are optimality, Boltzmann rationality, and causal entropy maximisation. One of the primary motivations behind IRL is to infer human preferences from human behaviour. However, the true relationship between human preferences and human behaviour is much more complex than any of the models currently used in IRL. This means that they are misspecified, which raises the worry that they might lead to unsound inferences if applied to real-world data. In this paper, we provide a mathematical analysis of how robust different IRL models are to misspecification, and answer precisely how the demonstrator policy may differ from each of the standard models before that model leads to faulty inferences about the reward function $R$. We also introduce a framework for reasoning about misspecification in IRL, together with formal tools that can be used to easily derive the misspecification robustness of new IRL models.
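
The behavioural models named in the abstract (optimality, Boltzmann rationality, and causal entropy maximisation) each specify how a policy $π$ is assumed to arise from a reward function $R$. As a rough illustration only, and not code from the paper, the Python sketch below shows how the optimality and Boltzmann-rationality models turn a reward function into a policy in a small tabular MDP. The function names (`soft_value_iteration`, `boltzmann_policy`, `optimal_policy`), the soft Bellman backup, and the toy MDP are all hypothetical choices made for this sketch.

```python
# Illustrative sketch (not from the paper): two standard IRL behavioural
# models, mapping a reward function R to a policy pi in a tabular MDP.
import numpy as np

def soft_value_iteration(P, R, gamma=0.9, beta=5.0, iters=500):
    """Compute soft Q-values for a tabular MDP (illustrative backup only).

    P: transition tensor of shape (S, A, S); R: reward over states, shape (S,).
    beta: inverse temperature used by the Boltzmann-rational model.
    """
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        # Soft value: V(s) = (1/beta) * log sum_a exp(beta * Q(s, a))
        V = (1.0 / beta) * np.log(np.sum(np.exp(beta * Q), axis=1))
        # Q(s, a) = E_{s'}[ R(s') + gamma * V(s') ]
        Q = P @ (R + gamma * V)
    return Q

def boltzmann_policy(Q, beta=5.0):
    """Boltzmann-rationality model: pi(a|s) proportional to exp(beta * Q(s, a))."""
    logits = beta * Q
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def optimal_policy(Q):
    """Optimality model: deterministic argmax over actions in each state."""
    pi = np.zeros_like(Q)
    pi[np.arange(Q.shape[0]), Q.argmax(axis=1)] = 1.0
    return pi

# Toy example: 3 states, 2 actions, random dynamics, reward on the last state.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(3, 2))   # shape (S, A, S)
R = np.array([0.0, 0.0, 1.0])
Q = soft_value_iteration(P, R)
print(boltzmann_policy(Q))
print(optimal_policy(Q))
```

In the Boltzmann model the inverse temperature β controls how nearly optimal the demonstrator is assumed to be: as β grows, the policy approaches the argmax policy of the optimality model. This is one concrete way to see how the choice of behavioural model shapes the inferences an IRL algorithm draws about $R$ from the same demonstrations.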
