Fused-Pagerank：基于循环融合的近似Pagerank

论文标题

Fused-Pagerank：基于循环融合的近似Pagerank

FUSED-PAGERANK: Loop-Fusion based Approximate PageRank

论文作者

Jain, Shalini, Utkoor, Rahul, Eedi, Hemalatha, Peri, Sathya, Upadrasta, Ramakrishna

论文摘要

Pagerank是一个图形中心度度量，它在给定图中赋予每个节点的重要性。 Pagerank算法提供了重要的见解，以通过与其他节点形成的连接来理解节点的行为。这是一种迭代算法，将每次迭代中的节点排名直到所有节点值收敛。 Pagerank算法是使用稀疏存储格式实现的，这导致代码中的内存访问不规则。此关键特征抑制了优化以提高其性能，并使优化Pagerank算法成为一个非平凡的问题。在这项工作中，我们通过减少其不规则的内存访问来提高Pagerank算法的性能。在本文中，我们提出了Fused-Pagerank算法，这是一种以编译器优化为导向的近似技术，可减少Pagerank算法中不规则的内存访问次数的数量，同时改善其算法的收敛，从而更快地获得结果。特别是，我们提出了使用环融合的近似Pagerank算法。我们认为，我们的工作是第一批正式应用传统编译器优化技术来用于Pagerank算法中不规则内存访问的作品。我们已经通过在各种数据集上执行实验来验证我们的方法：法律图，快照数据集和合成数据集。在这些基准测试中，我们实现了2.05x，2.23x，1.74x的最大速度（Vs. -O3优化），具有顺序版本，〜4.4x，〜2.61x，〜4.22x，〜4.22x，fused-pagerank Algorithm的平行版与Pagerank Edcentric Seption的Pagerank Algerank Algorank Algorithm的比较。

PageRank is a graph centrality metric that gives the importance of each node in a given graph. The PageRank algorithm provides important insights to understand the behavior of nodes through the connections they form with other nodes. It is an iterative algorithm that ranks the nodes in each iteration until all the node values converge. The PageRank algorithm is implemented using sparse storage format, which results in irregular memory accesses in the code. This key feature inhibits optimizations to improve its performance, and makes optimizing the PageRank algorithm a non-trivial problem. In this work we improve the performance of PageRank algorithm by reducing its irregular memory accesses. In this paper, we propose FUSED-PAGERANK algorithm, a compiler optimization oriented approximate technique that reduces the number of irregular memory accesses in the PageRank algorithm, improving its locality while making the convergence of the algorithm faster with better accuracy in results. In particular, we propose an approximate PageRank algorithm using Loop-Fusion. We believe that ours is the first work that formally applies traditional compiler optimization techniques for irregular memory access in the PageRank algorithm. We have verified our method by performing experiments on a variety of datasets: LAW graphs, SNAP datasets and synthesized datasets. On these benchmarks, we have achieved a maximum speedup (vs. -O3 optimization) of 2.05X, 2.23X, 1.74X with sequential version, and ~4.4X, ~2.61X, ~4.22X with parallel version of FUSED-PAGERANK algorithm in comparison with Edge-centric version of PageRank algorithm.

下载PDF全文

下载文献需遵守相关版权规定

论文标题