Deep Reinforcement Learning for System-on-Chip: Myths and Realities

Paper title

Deep Reinforcement Learning for System-on-Chip: Myths and Realities

Authors

Sung, Tegg Taekyong; Ryu, Bo

Abstract

Neural schedulers based on deep reinforcement learning (DRL) have shown considerable potential for solving real-world resource allocation problems, as they have demonstrated significant performance gains in the domain of cluster computing. In this paper, we investigate the feasibility of neural schedulers for the domain of System-on-Chip (SoC) resource allocation through extensive experiments and comparison with non-neural, heuristic schedulers. The key findings are three-fold. First, neural schedulers designed for the cluster computing domain do not work well for SoC due to i) the heterogeneity of SoC computing resources and ii) the variable action set caused by randomness in incoming jobs. Second, our novel neural scheduler technique, Eclectic Interaction Matching (EIM), overcomes the above challenges, thus significantly improving on existing neural schedulers. Specifically, we rationalize the underlying reasons behind the performance gain of the EIM-based neural scheduler. Third, we discover that the ratio of the average processing element (PE) switching delay to the average PE computation time significantly impacts the performance of neural SoC schedulers, even with EIM. Consequently, future neural SoC scheduler designs must consider this metric, as well as its implementation overhead, for practical utility.
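
The third finding turns on a single ratio: average PE switching delay divided by average PE computation time. The abstract does not say how either quantity is measured, so the following is only a minimal sketch of the arithmetic. The function name `switching_to_compute_ratio` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def switching_to_compute_ratio(switch_delays, compute_times):
    """Return mean PE switching delay divided by mean PE computation time.

    Both arguments are lists of measurements in the same time unit.
    This helper is an illustrative assumption, not the paper's code.
    """
    if not switch_delays or not compute_times:
        raise ValueError("both sample lists must be non-empty")
    avg_compute = mean(compute_times)
    if avg_compute <= 0:
        raise ValueError("average computation time must be positive")
    return mean(switch_delays) / avg_compute


# Hypothetical measurements, in microseconds.
ratio = switching_to_compute_ratio([2.0, 4.0], [10.0, 20.0, 30.0])
print(f"switching/compute ratio: {ratio:.2f}")
```

A larger value means that moving work between PEs costs more relative to the work itself. The abstract identifies this ratio as a significant factor in neural SoC scheduler performance.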
