Paper Title

GEVO: GPU Code Optimization using Evolutionary Computation

Authors

Jhe-Yu Liou, Xiaodong Wang, Stephanie Forrest, Carole-Jean Wu

Abstract

GPUs are a key enabler of the revolution in machine learning and high performance computing, functioning as de facto co-processors to accelerate large-scale computation. As the programming stack and tool support have matured, GPUs have also become accessible to programmers, who may lack detailed knowledge of the underlying architecture and fail to fully leverage the GPU's computation power. GEVO (Gpu optimization using EVOlutionary computation) is a tool for automatically discovering optimization opportunities and tuning the performance of GPU kernels in the LLVM representation. GEVO uses population-based search to find edits to GPU code compiled to LLVM-IR and improves performance on desired criteria while retaining required functionality. We demonstrate that GEVO improves the execution time of the GPU programs in the Rodinia benchmark suite and the machine learning models, SVM and ResNet18, on NVIDIA Tesla P100. For the Rodinia benchmarks, GEVO improves GPU kernel runtime performance by an average of 49.48% and by as much as 412% over the fully compiler-optimized baseline. If kernel output accuracy is relaxed to tolerate up to 1% error, GEVO can find kernel variants that outperform the baseline version by an average of 51.08%. For the machine learning workloads, GEVO achieves kernel performance improvement for SVM on the MNIST handwriting recognition (3.24X) and the a9a income prediction (2.93X) datasets with no loss of model accuracy. GEVO achieves 1.79X kernel performance improvement on image classification using ResNet18/CIFAR-10, with less than 1% model accuracy reduction.
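The abstract describes a population-based search that applies edits to a program and keeps variants that run faster while still passing their functional checks. The following is a minimal toy sketch of that general idea, not GEVO's implementation: the "program" is a list of made-up instruction costs, the only edit type is deletion, and `BASE_PROGRAM`, `REQUIRED`, `is_valid`, `runtime`, and `evolve` are all hypothetical names introduced for illustration. A real system would instead edit LLVM-IR, run test cases, and time the kernel on a GPU.

```python
import random

# Toy stand-in for a kernel: each entry is the (made-up) cost of one instruction.
BASE_PROGRAM = [5, 3, 8, 2, 7, 4]
# Assumption: deleting these instructions breaks functionality (i.e. tests fail).
REQUIRED = {0, 3}


def surviving(edits):
    """Indices of instructions left after applying a set of deletion edits."""
    return set(range(len(BASE_PROGRAM))) - set(edits)


def is_valid(edits):
    """Stand-in for running test cases: all required instructions must survive."""
    return REQUIRED <= surviving(edits)


def runtime(edits):
    """Stand-in for measured execution time: total cost of surviving instructions."""
    return sum(BASE_PROGRAM[i] for i in surviving(edits))


def fitness(edits):
    """Lower is better; variants that break functionality are never preferred."""
    return runtime(edits) if is_valid(edits) else float("inf")


def mutate(edits, rng):
    """Toggle one randomly chosen deletion edit on or off."""
    return frozenset(set(edits) ^ {rng.randrange(len(BASE_PROGRAM))})


def crossover(a, b, rng):
    """Keep edits shared by both parents; keep the others with probability 1/2."""
    return frozenset(
        e for e in a | b if (e in a and e in b) or rng.random() < 0.5
    )


def evolve(pop_size=20, generations=30, seed=0):
    """Elitist evolutionary search over sets of deletion edits."""
    rng = random.Random(seed)
    population = [frozenset()] * pop_size  # start from the unmodified program
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        elite = ranked[: pop_size // 4]  # best variants survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(elite), rng.choice(elite)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    return min(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print("edits:", sorted(best), "runtime:", runtime(best))
```

Because the elite is carried over unchanged each generation, the best valid variant found so far is never lost, so the result is always at least as fast as the unmodified program in this toy model.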
