Cooperative Trajectory Planning in Uncertain Environments with Monte Carlo Tree Search and Risk Metrics

Paper Title

Cooperative Trajectory Planning in Uncertain Environments with Monte Carlo Tree Search and Risk Metrics

Authors

Philipp Stegmaier, Karl Kurzer, J. Marius Zöllner

Abstract

Automated vehicles require the ability to cooperate with humans for smooth integration into today's traffic. While the concept of cooperation is well known, developing a robust and efficient cooperative trajectory planning method is still a challenge. One aspect of this challenge is the uncertainty surrounding the state of the environment due to limited sensor accuracy. This uncertainty can be represented by a Partially Observable Markov Decision Process. Our work addresses this problem by extending an existing cooperative trajectory planning approach based on Monte Carlo Tree Search for continuous action spaces. It does so by explicitly modeling uncertainties in the form of a root belief state, from which start states for trees are sampled. After the trees have been constructed with Monte Carlo Tree Search, their results are aggregated into return distributions using kernel regression. We apply two risk metrics for the final selection, namely a Lower Confidence Bound and a Conditional Value at Risk. It can be demonstrated that the integration of risk metrics in the final selection policy consistently outperforms a baseline in uncertain environments, generating considerably safer trajectories.
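
The final selection step described in the abstract can be illustrated with a minimal sketch. It assumes each candidate action already has a list of sampled returns (in the paper these come from trees started at states sampled from the root belief and aggregated with kernel regression, which is omitted here). The function names, the parameters `k` and `alpha`, the action labels, and the return values are all hypothetical, and the exact definitions of the two metrics in the paper may differ from the common textbook forms used below.

```python
import math
from statistics import mean, pstdev


def lower_confidence_bound(returns, k=1.0):
    """Mean return minus k population standard deviations (assumed form)."""
    return mean(returns) - k * pstdev(returns)


def conditional_value_at_risk(returns, alpha=0.25):
    """Mean of the worst alpha fraction of returns (assumed form)."""
    ordered = sorted(returns)
    n_tail = max(1, math.ceil(alpha * len(ordered)))
    return mean(ordered[:n_tail])


def select_action(returns_by_action, metric):
    """Pick the action whose return samples score highest under `metric`."""
    return max(returns_by_action, key=lambda a: metric(returns_by_action[a]))


# Hypothetical return samples for two candidate actions.
returns_by_action = {
    "keep_lane": [5.0, 5.0, 5.0, 5.0],   # modest but certain
    "merge": [12.0, 12.0, 12.0, -10.0],  # better on average, one bad outcome
}

risk_neutral = select_action(returns_by_action, mean)
with_lcb = select_action(returns_by_action, lower_confidence_bound)
with_cvar = select_action(returns_by_action, conditional_value_at_risk)
print(risk_neutral, with_lcb, with_cvar)
```

In this toy data, a risk-neutral selection by mean return favors the action with the higher average, while both risk metrics penalize its poor worst case and favor the more predictable action.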
