Trajectory Optimization of Flying Energy Sources using Q-Learning to Recharge Hotspot UAVs

Paper title

Trajectory Optimization of Flying Energy Sources using Q-Learning to Recharge Hotspot UAVs

Authors

Hoseini, Sayed Amir; Hassan, Jahan; Bokani, Ayub; Kanhere, Salil S.

Abstract

Despite the increasing popularity of commercial UAV use and drone-delivered services, the dependence of UAVs on limited-capacity on-board batteries restricts their flight time and mission continuity. Developing in-situ power transfer solutions for topping up UAV batteries therefore has the potential to extend mission duration. In this paper, we study a scenario where UAVs are deployed as base stations (UAV-BSs) providing wireless hotspot services to ground nodes while harvesting wireless energy from flying energy sources. These energy sources are specialized UAVs (charger or transmitter UAVs, tUAVs) equipped with wireless power transmitting devices such as RF antennae. tUAVs have the flexibility to adjust their flight paths to maximize energy transfer. As the number of UAV-BSs and the environmental complexity increase, an intelligent trajectory selection procedure for tUAVs becomes necessary to optimize the energy transfer gain. We model the trajectory optimization of tUAVs as a Markov Decision Process (MDP) problem and solve it using the Q-Learning algorithm. Simulation results confirm that the Q-Learning-based optimized trajectory of the tUAVs outperforms two benchmark strategies, namely random path planning and static hovering of the tUAVs.
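
To make the MDP and Q-Learning formulation mentioned in the abstract more concrete, the following is a minimal tabular Q-Learning sketch for a single tUAV on a toy grid. It is not the authors' model: the grid size, the UAV-BS positions, the action set, the distance-based reward (a stand-in for harvested energy), and all hyperparameters are assumptions chosen only for illustration.

```python
import random

# Toy setup: every value here is an illustrative assumption, not taken from the paper.
GRID = 5                                   # tUAV moves on a GRID x GRID horizontal grid
UAV_BS = [(1, 1), (3, 4)]                  # hypothetical fixed UAV-BS positions
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus hover


def reward(pos):
    """Stand-in for harvested energy: decays with squared distance to each UAV-BS."""
    return sum(1.0 / (1.0 + (pos[0] - x) ** 2 + (pos[1] - y) ** 2) for x, y in UAV_BS)


def step(pos, action):
    """Apply a move, clamping to the grid; return the next state and its reward."""
    nx = min(max(pos[0] + action[0], 0), GRID - 1)
    ny = min(max(pos[1] + action[1], 0), GRID - 1)
    return (nx, ny), reward((nx, ny))


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(GRID) for y in range(GRID) for a in range(n_actions)}
    for _ in range(episodes):
        pos = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)                            # explore
            else:
                a = max(range(n_actions), key=lambda i: q[(pos, i)])   # exploit
            nxt, r = step(pos, ACTIONS[a])
            best_next = max(q[(nxt, i)] for i in range(n_actions))
            # Standard Q-Learning update toward the bootstrapped target.
            q[(pos, a)] += alpha * (r + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q


def greedy_path(q, start, horizon=10):
    """Follow the learned greedy policy from a start cell."""
    pos, path = start, [start]
    for _ in range(horizon):
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos, _ = step(pos, ACTIONS[a])
        path.append(pos)
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table, (0, 0)))
```

In this sketch the state is only the tUAV's grid cell. A fuller model would likely also track quantities such as the battery levels of the UAV-BSs, which the toy reward ignores.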
