Paper Title

Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs

Authors

Yehia Arafa, Ammar ElWazir, Abdelrahman ElKanishy, Youssef Aly, Ayatelrahman Elsayed, Abdel-Hameed Badawy, Gopinath Chennupati, Stephan Eidenbenz, Nandakishore Santhi

Abstract

GPUs are prevalent in modern computing systems at all scales, and they consume a significant fraction of the energy in these systems. However, vendors do not publish the actual power/energy cost of their internal microarchitecture. In this paper, we accurately measure the energy consumption of various PTX instructions found in modern NVIDIA GPUs. We provide an exhaustive comparison of more than 40 instructions for four high-end NVIDIA GPUs from four different generations (Maxwell, Pascal, Volta, and Turing). Furthermore, we show the effect of the CUDA compiler optimizations on the energy consumption of each instruction. We use three different software techniques, all based on NVIDIA's NVML API, to read the GPU on-chip power sensors, and we provide an in-depth comparison of these techniques. Additionally, we verify the software measurement techniques against a custom-designed hardware power measurement. The results show that Volta GPUs have the best energy efficiency of the four generations across the different categories of instructions. This work should aid in understanding NVIDIA GPUs' microarchitecture. It should also make energy measurements of any GPU kernel both efficient and accurate.
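
As a rough illustration of how sampled power readings can be turned into a per-instruction energy figure, the following is a minimal sketch and not the authors' method. It assumes power samples have already been collected (in practice these could come from NVML's `nvmlDeviceGetPowerUsage`, which reports milliwatts), that a constant idle power can be subtracted, and that the number of executed instructions is known. The function names, the idle-power subtraction, and the sample values are all hypothetical.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def energy_per_instruction(
    timestamps_s: Sequence[float],
    power_w: Sequence[float],
    idle_power_w: float,
    instruction_count: int,
) -> float:
    """Estimate dynamic energy per instruction in joules.

    Assumption (not from the paper): idle power is constant over the run
    and is subtracted from the measured total.
    """
    if instruction_count <= 0:
        raise ValueError("instruction_count must be positive")
    elapsed_s = timestamps_s[-1] - timestamps_s[0]
    dynamic_j = energy_joules(timestamps_s, power_w) - idle_power_w * elapsed_s
    return dynamic_j / instruction_count


if __name__ == "__main__":
    # Synthetic samples for illustration only; not measured data.
    t = [0.0, 1.0, 2.0]
    p = [50.0, 50.0, 50.0]
    print(energy_joules(t, p))
    print(energy_per_instruction(t, p, idle_power_w=30.0, instruction_count=4))
```

Trapezoidal integration is chosen here only because it is a simple way to handle unevenly spaced samples; the paper's own measurement and verification procedure may differ.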
