Paper title
Planning under periodic observations: bounds and bounding-based solutions
Paper authors
Paper abstract
We study planning problems faced by robots operating in uncertain environments with incomplete knowledge of state, and actions that are noisy and/or imprecise. This paper identifies a new problem sub-class that models settings in which information is revealed only intermittently through some exogenous process that provides state information periodically. Several practical domains fit this model, including the specific scenario that motivates our research: autonomous navigation of a planetary exploration rover augmented by remote imaging. With an eye to efficient specialized solution methods, we examine the structure of instances of this sub-class. They lead to Markov Decision Processes with exponentially large action-spaces but for which, as those actions comprise sequences of more atomic elements, one may establish performance bounds by comparing policies under different information assumptions. This provides a way in which to construct performance bounds systematically. Such bounds are useful because, in conjunction with the insights they confer, they can be employed in bounding-based methods to obtain high-quality solutions efficiently; the empirical results we present demonstrate their effectiveness for the considered problems. The foregoing has also alluded to the distinctive role that time plays for these problems -- more specifically: time until information is revealed -- and we uncover and discuss several interesting subtleties in this regard.