Paper Title

Fireiron: A Scheduling Language for High-Performance Linear Algebra on GPUs

Authors

Bastian Hagedorn, Archibald Samuel Elliott, Henrik Barthels, Rastislav Bodik, Vinod Grover

Abstract

Achieving high-performance GPU kernels requires optimizing algorithm implementations to the targeted GPU architecture. It is of utmost importance to fully use the compute and memory hierarchy, as well as available specialised hardware. Currently, vendor libraries like cuBLAS and cuDNN provide the best performing implementations of GPU algorithms. However, the task of the library programmer is incredibly challenging: for each provided algorithm, high-performance implementations have to be developed for all commonly used architectures, input sizes, and different storage formats. These implementations are generally provided as optimized assembly code because performance-critical architectural features are only exposed at this level. This prevents reuse between different implementations of even the same algorithm, as simple differences can have major effects on low-level implementation details. In this paper, we introduce Fireiron, a DSL and compiler which allows the specification of high-performance GPU implementations as compositions of simple and reusable building blocks. We show how to use Fireiron to optimize matrix multiplication implementations, achieving performance matching hand-coded CUDA kernels, even when using specialised hardware such as NVIDIA Tensor Cores, and outperforming state-of-the-art implementations provided by cuBLAS by more than 2x.
