A Transfer Learning Approach for UAV Path Design with Connectivity Outage Constraint

Paper title

A Transfer Learning Approach for UAV Path Design with Connectivity Outage Constraint

Authors

Gianluca Fontanesi, Anding Zhu, Mahnaz Arvaneh, Hamed Ahmadi

Abstract

Connectivity-aware path design is crucial to the effective deployment of autonomous Unmanned Aerial Vehicles (UAVs). Recently, Reinforcement Learning (RL) algorithms have become a popular approach to this type of complex problem, but they suffer from slow convergence. In this paper, we propose a Transfer Learning (TL) approach in which a teacher policy previously trained in an old domain is used to boost the path learning of the agent in the new domain. As exploration and training continue, the agent refines the path design in the new domain based on its subsequent interactions with the environment. We evaluate our approach with an old domain at sub-6 GHz and a new domain at millimeter wave (mmWave). The teacher path policy, previously trained at sub-6 GHz, is the solution to a connectivity-aware path problem that we formulate as a constrained Markov Decision Process (CMDP). To solve the path design at sub-6 GHz, we employ a Lyapunov-based model-free Deep Q-Network (DQN) that guarantees connectivity constraint satisfaction. We empirically demonstrate the effectiveness of our approach in different urban environment scenarios. The results show that the proposed approach considerably reduces the training time at mmWave.
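
The abstract does not spell out how the teacher policy is injected into the new-domain training. One common way to realize teacher-guided exploration, shown here purely as an illustrative sketch and not as the authors' Lyapunov-based DQN, is to follow the teacher's action with a probability that decays over episodes and otherwise act epsilon-greedily on the student's own value estimates. All names below (`choose_action`, `teacher_probability`, `q_student`, `teacher_policy`) and the decay schedule are assumptions made for illustration.

```python
import random

def teacher_probability(episode, p_start=0.9, decay=0.95):
    """Hypothetical schedule: reliance on the teacher shrinks geometrically."""
    return p_start * decay ** episode

def choose_action(state, q_student, teacher_policy, n_actions,
                  p_teacher, epsilon, rng):
    """Select an action for the student agent in the new domain.

    With probability p_teacher the student copies the teacher's action
    (policy trained in the old domain); otherwise it acts epsilon-greedily
    on its own Q-table, stored as a dict keyed by (state, action).
    """
    if rng.random() < p_teacher:
        return teacher_policy(state)
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = [q_student.get((state, a), 0.0) for a in range(n_actions)]
    return max(range(n_actions), key=values.__getitem__)

# Toy usage: the teacher always suggests action 2, while the student's
# own estimates currently favour action 1 in state "s0".
rng = random.Random(0)
q_student = {("s0", 1): 1.0}
teacher = lambda state: 2
early = choose_action("s0", q_student, teacher, 4, 1.0, 0.0, rng)
late = choose_action("s0", q_student, teacher, 4, 0.0, 0.0, rng)
```

In this toy setting the student copies the teacher while `p_teacher` is high and falls back on its own estimates once it has decayed, which mirrors the abstract's description of the agent refining its path through continued interaction with the new environment.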
