Paper Title
A Distributed Model-Free Ride-Sharing Approach for Joint Matching, Pricing, and Dispatching using Deep Reinforcement Learning
Paper Authors
Paper Abstract
Significant development of ride-sharing services presents a plethora of opportunities to transform urban mobility by providing personalized and convenient transportation while ensuring the efficiency of large-scale ride pooling. However, a core problem for such services is route planning for each driver to fulfill dynamically arriving requests while satisfying given constraints. Current models are mostly limited to static routes with only two rides per vehicle (optimally) or three (with heuristics). In this paper, we present a dynamic, demand-aware, and pricing-based vehicle-passenger matching and route planning framework that (1) dynamically generates optimal routes for each vehicle based on online demand, the pricing associated with each ride, and vehicle capacities and locations; this matching algorithm starts greedily and optimizes over time using an insertion operation; (2) involves drivers in the decision-making process by allowing them to propose a different price based on the expected reward for a particular ride as well as the destination locations of future rides, which is influenced by supply and demand computed by the Deep Q-network; (3) allows customers to accept or reject rides based on their preferences with respect to pricing, delay windows, vehicle type, and carpooling; and (4) based on demand prediction, re-balances idle vehicles by dispatching them to areas of anticipated high demand using deep Reinforcement Learning (RL). Our framework is validated using the New York City Taxi public dataset; however, we consider different vehicle types and design customer utility functions to validate the setup and study different settings. Experimental results show the effectiveness of our approach in real-time and large-scale settings.
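The insertion operation mentioned in point (1) can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes Manhattan distance as the travel cost, represents a route as a list of `(location, delta)` stops where `delta` is +1 for a pickup and -1 for a dropoff, and uses hypothetical function names. It tries every feasible position pair for a new request's pickup and dropoff and keeps the cheapest route that never exceeds vehicle capacity.

```python
def dist(a, b):
    """Manhattan distance between two (x, y) grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def route_cost(points):
    """Total travel cost of visiting the points in order."""
    return sum(dist(points[i], points[i + 1]) for i in range(len(points) - 1))

def insert_request(route, pickup, dropoff, capacity):
    """Insert a new (pickup, dropoff) request into an existing route.

    route: list of (point, delta) stops, delta = +1 (pickup) or -1 (dropoff).
    Returns the cheapest feasible new route, or None if no insertion
    position respects the capacity constraint.
    """
    best, best_cost = None, float("inf")
    n = len(route)
    for i in range(n + 1):            # pickup position
        for j in range(i + 1, n + 2):  # dropoff position, after the pickup
            cand = route[:i] + [(pickup, +1)] + route[i:]
            cand = cand[:j] + [(dropoff, -1)] + cand[j:]
            # Capacity check: running passenger load must never exceed capacity.
            load, feasible = 0, True
            for _, delta in cand:
                load += delta
                if load > capacity:
                    feasible = False
                    break
            cost = route_cost([p for p, _ in cand])
            if feasible and cost < best_cost:
                best, best_cost = cand, cost
    return best
```

For example, inserting a request from (1, 0) to (3, 0) into a route already carrying one passenger from (0, 0) to (4, 0), with capacity 2, yields the pooled route (0, 0) → (1, 0) → (3, 0) → (4, 0) at no extra distance. Starting greedily and re-applying this operation as new requests arrive is what lets the framework escape the two-to-three-rides-per-vehicle limit of static models.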