Title
Deep Reinforcement Learning for Human-Like Driving Policies in Collision Avoidance Tasks of Self-Driving Cars
Authors
Abstract
The technological and scientific challenges involved in the development of autonomous vehicles (AVs) are currently of primary interest for many automobile companies and research labs. However, human-controlled vehicles are likely to remain on the roads for several decades to come and may share with AVs the traffic environments of the future. In such mixed environments, AVs should deploy human-like driving policies and negotiation skills to enable smooth traffic flow. To generate automated human-like driving policies, we introduce a model-free, deep reinforcement learning approach to imitate an experienced human driver's behavior. We study a static obstacle avoidance task on a two-lane highway road in simulation (Unity). Our control algorithm receives a stochastic feedback signal from two sources: a model-driven part, encoding simple driving rules, such as lane-keeping and speed control, and a stochastic, data-driven part, incorporating human expert knowledge from driving data. To assess the similarity between machine and human driving, we model distributions of track position and speed as Gaussian processes. We demonstrate that our approach leads to human-like driving policies.
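The abstract describes a feedback signal composed of two parts: a model-driven term encoding simple rules (lane-keeping, speed control) and a data-driven term incorporating human driving data. A minimal sketch of such a composite reward is shown below; all function names, the equal weighting, the Gaussian-shaped terms, and the 25 m/s target speed are illustrative assumptions, not the paper's actual reward design.

```python
import math

def model_driven_reward(lane_offset, speed, target_speed=25.0):
    # Rule-based part: reward staying centered in the lane and
    # driving near a target speed (both terms peak at 1.0).
    lane_term = math.exp(-lane_offset ** 2)
    speed_term = math.exp(-((speed - target_speed) / target_speed) ** 2)
    return 0.5 * lane_term + 0.5 * speed_term

def data_driven_reward(track_pos, human_mean, human_std):
    # Data-driven part: unnormalized Gaussian likelihood of the agent's
    # track position under a model fitted to human driving data.
    return math.exp(-0.5 * ((track_pos - human_mean) / human_std) ** 2)

def total_reward(lane_offset, speed, track_pos, human_mean, human_std):
    # Combine rule-based and human-data-based terms into one signal.
    return (model_driven_reward(lane_offset, speed)
            + data_driven_reward(track_pos, human_mean, human_std))
```

An agent driving centered, at the target speed, and exactly where the human model expects would receive the maximum combined reward under this sketch.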
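To assess human-likeness, the authors model distributions of track position and speed as Gaussian processes. A minimal NumPy sketch of GP regression with an RBF kernel is given below, e.g. for fitting track position as a function of distance along the road; the kernel hyperparameters and noise level are assumptions for illustration, not the paper's values.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=5.0, variance=1.0):
    # Squared-exponential (RBF) covariance between 1-D input arrays.
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    # Standard GP regression equations: posterior mean and
    # pointwise posterior variance at the test inputs.
    K = rbf_kernel(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    Kss = rbf_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)
```

With such a model fitted separately to human and machine trajectories, the two posterior distributions over track position (or speed) can be compared pointwise along the road to quantify how human-like the learned policy is.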