Estimating Link Flows in Road Networks with Synthetic Trajectory Data Generation: Reinforcement Learning-based Approaches

Paper Title

Estimating Link Flows in Road Networks with Synthetic Trajectory Data Generation: Reinforcement Learning-based Approaches

Authors

Miner Zhong, Jiwon Kim, Zuduo Zheng

Abstract

This paper addresses the problem of estimating link flows in a road network by combining limited traffic volume and vehicle trajectory data. While traffic volume data from loop detectors have been the common data source for link flow estimation, the detectors only cover a subset of links. Vehicle trajectory data collected from vehicle tracking sensors are also incorporated these days. However, trajectory data are often sparse in that the observed trajectories only represent a small subset of the whole population, where the exact sampling rate is unknown and may vary over space and time. This study proposes a novel generative modelling framework, where we formulate the link-to-link movements of a vehicle as a sequential decision-making problem using the Markov Decision Process framework and train an agent to make sequential decisions to generate realistic synthetic vehicle trajectories. We use Reinforcement Learning (RL)-based methods to find the best behaviour of the agent, based on which synthetic population vehicle trajectories can be generated to estimate link flows across the whole network. To ensure the generated population vehicle trajectories are consistent with the observed traffic volume and trajectory data, two methods based on Inverse Reinforcement Learning and Constrained Reinforcement Learning are proposed. The proposed generative modelling framework solved by either of these RL-based methods is validated by solving the link flow estimation problem in a real road network. Additionally, we perform comprehensive experiments to compare the performance with two existing methods. The results show that the proposed framework has higher estimation accuracy and robustness under realistic scenarios where certain behavioural assumptions about drivers are not met or the network coverage and penetration rate of trajectory data are low.
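To make the generative idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical four-link toy network and a hand-written stochastic policy (`POLICY`) standing in for the behaviour an RL agent would learn. It rolls out link-to-link decisions to produce synthetic trajectories, then counts how often each link is traversed to obtain link flow estimates. The link names, probabilities, the `END` action, and the function names are all illustrative assumptions.

```python
import random
from collections import Counter

# Assumed stochastic policy: P(next link | current link).
# In the paper's framework this behaviour would be learned with RL;
# the network and the numbers here are made up purely for illustration.
# "END" is a hypothetical terminal action that finishes a trip.
POLICY = {
    "A": {"B": 0.7, "C": 0.3},
    "B": {"D": 1.0},
    "C": {"D": 1.0},
    "D": {"END": 1.0},
}


def generate_trajectory(start, rng, max_steps=50):
    """Roll out one synthetic trajectory as a sequence of links."""
    trajectory = [start]
    current = start
    for _ in range(max_steps):
        choices = list(POLICY[current].keys())
        weights = list(POLICY[current].values())
        # Sample the next link according to the policy (one sequential decision).
        next_link = rng.choices(choices, weights=weights, k=1)[0]
        if next_link == "END":
            break
        trajectory.append(next_link)
        current = next_link
    return trajectory


def estimate_link_flows(n_vehicles, start="A", seed=0):
    """Generate a synthetic vehicle population and count traversals per link."""
    rng = random.Random(seed)
    flows = Counter()
    for _ in range(n_vehicles):
        flows.update(generate_trajectory(start, rng))
    return flows


flows = estimate_link_flows(1000)
print(dict(flows))
```

In this toy setting every trip starts on link A and ends on link D, so both carry the full population, while the split between B and C reflects the policy's probabilities. The methods described in the abstract would additionally constrain or shape the policy so that the generated flows agree with observed detector volumes and observed trajectories.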
