Paper Title
AoI Minimization in Status Update Control with Energy Harvesting Sensors
Paper Authors
Paper Abstract
Information freshness is crucial for time-critical IoT applications, e.g., monitoring and control systems. We consider an IoT status update system with multiple users, multiple energy harvesting sensors, and a wireless edge node. The users receive time-sensitive information about physical quantities, each measured by a sensor. Users send requests to the edge node, where a cache contains the most recently received measurement from each sensor. To serve a request, the edge node either commands the sensor to send a status update or retrieves the aged measurement from the cache. We aim to find the optimal actions of the edge node that minimize the age of information (AoI) of the served measurements. We model this problem as a Markov decision process and develop reinforcement learning (RL) algorithms: a model-based value iteration method and a model-free Q-learning method. We also propose a Q-learning method for the realistic case where the edge node is informed about the sensors' battery levels only via the status updates. The case with transmission limitations is also addressed. Furthermore, properties of an optimal policy are characterized analytically. Simulation results show that an optimal policy has a threshold-based structure and that the proposed RL methods significantly reduce the average cost compared to several baselines.
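Illustrative sketch (not from the paper): to give a rough sense of the kind of model-free approach the abstract mentions, below is a minimal tabular Q-learning toy for a hypothetical single-sensor version of the problem. The state is (cached AoI, sensor battery) and the actions are serve-from-cache versus command-an-update. All dynamics and parameters here (the AoI truncation AOI_MAX, battery cap B_MAX, the 0.3 harvesting probability, the learning rates) are assumptions for illustration only, not the paper's formulation.

import numpy as np

AOI_MAX, B_MAX = 10, 5          # assumed truncated AoI range and battery capacity
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

# Q[aoi, battery, action]; action 0 = serve cached value, action 1 = request update
Q = np.zeros((AOI_MAX + 1, B_MAX + 1, 2))

def step(aoi, battery, action):
    """One slot of the toy MDP; returns (cost, next_aoi, next_battery)."""
    if action == 1 and battery > 0:   # fresh update: served AoI is 1, one unit of energy spent
        cost = 1
        aoi = 1
        battery -= 1
    else:                             # serve the aged cached measurement
        cost = aoi
    aoi = min(aoi + 1, AOI_MAX)       # cached measurement ages by one slot
    if rng.random() < 0.3:            # assumed Bernoulli energy harvesting
        battery = min(battery + 1, B_MAX)
    return cost, aoi, battery

aoi, battery = 1, B_MAX
for _ in range(200_000):
    # epsilon-greedy action selection (greedy = cost-minimizing)
    a = rng.integers(2) if rng.random() < EPS else int(np.argmin(Q[aoi, battery]))
    cost, n_aoi, n_batt = step(aoi, battery, a)
    # standard Q-learning update for a discounted cost-minimization objective
    target = cost + GAMMA * Q[n_aoi, n_batt].min()
    Q[aoi, battery, a] += ALPHA * (target - Q[aoi, battery, a])
    aoi, battery = n_aoi, n_batt

# Greedy policy per state; in this toy it comes out threshold-based in the cached AoI
print(Q.argmin(axis=2))

In this toy the learned policy requests an update once the cached AoI exceeds a battery-dependent threshold, which is consistent with the threshold-based structure the abstract reports for an optimal policy.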