Paper title

Joint Power Allocation and Beamformer for mmW-NOMA Downlink Systems by Deep Reinforcement Learning

Authors

Akbarpour-Kasgari, Abbas; Ardebilipour, Mehrdad

Abstract

The high data-rate demand of next-generation wireless communication can be met by Non-Orthogonal Multiple Access (NOMA) in the millimetre-wave (mmW) frequency band. mmW-NOMA systems require joint power allocation and beamforming, which can be handled as an optimization problem. To this end, we use Deep Reinforcement Learning (DRL), whose learned policy leads to an optimized sum-rate of the users. An actor-critic scheme is used to measure the immediate reward and to provide the new action that maximizes the overall Q-value of the network. The immediate reward is defined as the sum of the rates of two users, with the minimum guaranteed rate for each user and the total consumed power as constraints. Simulation results show that the proposed approach outperforms Time-Division Multiple Access (TDMA) and another optimized NOMA strategy in terms of the users' sum-rate.
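The abstract does not give the reward formula, so the following is only a minimal sketch of how a two-user sum-rate reward with rate and power constraints might look. Everything specific in it is an assumption made for illustration: the ideal successive interference cancellation (SIC) rate model, the effective channel gains `g_strong` and `g_weak` (taken to already include the beamforming gain), the hypothetical function names, and the fixed `penalty` applied when a constraint is violated. It is not the authors' implementation.

```python
import math


def noma_rates(p_strong, p_weak, g_strong, g_weak, noise=1.0):
    """Two-user downlink NOMA rates in bit/s/Hz under ideal SIC (assumed model).

    g_strong and g_weak are effective channel power gains; the beamforming
    gain is assumed to be folded into them.
    """
    # The strong user cancels the weak user's signal, so it sees noise only.
    r_strong = math.log2(1.0 + p_strong * g_strong / noise)
    # The weak user treats the strong user's signal as interference.
    r_weak = math.log2(1.0 + p_weak * g_weak / (p_strong * g_weak + noise))
    return r_strong, r_weak


def immediate_reward(p_strong, p_weak, g_strong, g_weak,
                     r_min, p_max, noise=1.0, penalty=10.0):
    """Sum-rate reward with penalties for violated constraints (assumed form)."""
    r_strong, r_weak = noma_rates(p_strong, p_weak, g_strong, g_weak, noise)
    reward = r_strong + r_weak
    # Minimum guaranteed rate for each user.
    if r_strong < r_min or r_weak < r_min:
        reward -= penalty
    # Total transmit power budget.
    if p_strong + p_weak > p_max:
        reward -= penalty
    return reward
```

In an actor-critic loop, the actor would propose the power split (and beamforming parameters) as the action, and a function of this kind would score it. Subtracting a penalty is only one way to encode the constraints; projecting the action onto the feasible set is another.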
