Paper Title
High Bandwidth Memory on FPGAs: A Data Analytics Perspective
Authors
Abstract
FPGA-based data processing in datacenters is increasing in popularity due to the demands of modern workloads and the ensuing necessity for hardware specialization. Driven by this trend, vendors are rapidly adapting reconfigurable devices to suit data- and compute-intensive workloads. The inclusion of High Bandwidth Memory (HBM) in FPGA devices is a recent example. HBM promises to overcome the bandwidth bottleneck often faced by FPGA-based accelerators due to their throughput-oriented design. In this paper, we study the usage and benefits of HBM on FPGAs from a data analytics perspective. We consider three workloads that are often performed in analytics-oriented databases and implement them on an FPGA, showing in which cases they benefit from HBM: range selection, hash join, and stochastic gradient descent (SGD) for linear model training. We integrate our designs into a columnar database (MonetDB) and show the trade-offs arising from the integration related to data movement and partitioning. In certain cases, FPGA+HBM based solutions are able to surpass the highest performance provided by either a 2-socket POWER9 system or a 14-core Xeon E5 by up to 1.8x (selection), 12.9x (join), and 3.2x (SGD).