A Reinforcement Learning based approach for Multi-target Detection in Massive MIMO radar

Paper Title

A Reinforcement Learning based approach for Multi-target Detection in Massive MIMO radar

Authors

Ahmed, Aya Mostafa; Ahmad, Alaa Alameer; Fortunati, Stefano; Sezgin, Aydin; Greco, Maria S.; Gini, Fulvio

Abstract

This paper considers the problem of multi-target detection for massive multiple-input multiple-output (MMIMO) cognitive radar (CR). The concept of CR is based on the perception-action cycle, in which the radar senses the dynamic environment and intelligently adapts to it in order to optimally satisfy a specific mission. However, this usually requires a priori knowledge of the environmental model, which is not available in most cases. We propose a reinforcement learning (RL) based algorithm for cognitive multi-target detection in the presence of unknown disturbance statistics. The radar acts as an agent that continuously senses the unknown environment (i.e., targets and disturbance) and consequently optimizes the transmitted waveforms in order to maximize the probability of detection ($P_\mathsf{D}$) by focusing the energy in specific range-angle cells (i.e., beamforming). Furthermore, we propose a solution to the beamforming optimization problem with lower complexity than existing methods. Numerical simulations are performed to assess the performance of the proposed RL-based algorithm in both stationary and dynamic environments. The RL-based beamforming is compared to the conventional omnidirectional approach with equal power allocation and to adaptive beamforming without RL. As the numerical results show, our RL-based beamformer outperforms both approaches in terms of target detection performance. The improvement is particularly remarkable under harsh environmental conditions such as low SNR, heavy-tailed disturbance, and rapidly changing scenarios.

Paper Title