A Unified Learning Platform for Dynamic Frequency Scaling in Pipelined Processors

Paper Title

A Unified Learning Platform for Dynamic Frequency Scaling in Pipelined Processors

Authors

Ajirlou, Arash Fouman; Partin-Vaisband, Inna

Abstract

A machine learning (ML) design framework is proposed for dynamically adjusting the clock frequency based on the propagation delay of individual instructions. A Random Forest model is trained to classify propagation delays in real time, using the current operation type, the current operands, and the computation history as ML features. The trained model is implemented in Verilog as an additional pipeline stage within a baseline processor. The modified system is simulated at the gate level in 45 nm CMOS technology, exhibiting a speed-up of 68% and an energy reduction of 37% with coarse-grained ML classification. A speed-up of 95% is demonstrated with finer granularities at additional energy cost.
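
The trade-off between classification granularity and speed-up described in the abstract can be illustrated with a small sketch. It is not the authors' model: it assumes an oracle that always picks the correct delay class (the paper uses a trained Random Forest for this step), evenly spaced clock periods, and no overhead for switching frequency or for the extra pipeline stage. The function names and the delay values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def class_periods(max_delay, num_classes):
    """Evenly spaced clock periods; the last one equals the worst-case delay."""
    return [max_delay * (k + 1) / num_classes for k in range(num_classes)]


def classify(delay, periods):
    """Index of the shortest clock period that still covers `delay`.

    Assumption: a perfect (oracle) classifier stands in for the ML model.
    """
    return bisect_left(periods, delay)


def speedup(delays, num_classes):
    """Ratio of runtime at a fixed worst-case clock to runtime when each
    instruction is clocked at the period of its delay class."""
    worst = max(delays)
    periods = class_periods(worst, num_classes)
    baseline = worst * len(delays)
    scaled = sum(periods[classify(d, periods)] for d in delays)
    return baseline / scaled


# Hypothetical per-instruction propagation delays in ns (not from the paper).
delays_ns = [1.0, 2.0, 2.0, 4.0, 1.0, 3.0, 4.0, 2.0]

for n in (1, 2, 4):
    print(f"{n} class(es): speed-up x{speedup(delays_ns, n):.3f}")
```

With a single class every instruction runs at the worst-case period, so the ratio is 1. Adding classes lets short instructions use shorter periods, which is the effect the finer-granularity result refers to. A real design would also have to account for misclassification and the energy cost of the classifier itself.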
