A Model of Job Parallelism for Latency Reduction in Large-Scale Systems

Paper Title

A Model of Job Parallelism for Latency Reduction in Large-Scale Systems

Authors

Ayalvadi Ganesh, Arpan Mukhopadhyay

Abstract

Processing computation-intensive jobs at multiple processing cores in parallel is essential in many real-world applications. In this paper, we consider an idealised model for job parallelism in which a job can be served simultaneously by $d$ distinct servers. The job is considered complete when the total amount of work done on it by the $d$ servers equals its size. We study the effect of parallelism on the average delay of jobs. Specifically, we analyze a system consisting of $n$ parallel processor sharing servers in which jobs arrive according to a Poisson process of rate $n\lambda$ ($\lambda < 1$) and each job brings an exponentially distributed amount of work with unit mean. Upon arrival, a job selects $d$ servers uniformly at random and joins all the chosen servers simultaneously. We show by a mean-field analysis that, for fixed $d \geq 2$ and large $n$, the average occupancy of servers is $O(\log(1/(1-\lambda)))$ as $\lambda \to 1$, in comparison to $O(1/(1-\lambda))$ average occupancy for $d=1$. Thus, we obtain an exponential reduction in the response time of jobs through parallelism. We make significant progress towards rigorously justifying the mean-field analysis.
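
As a rough illustration of the model described in the abstract, the following Python sketch simulates $n$ unit-rate processor sharing servers with Poisson arrivals of rate $n\lambda$, where each job joins $d$ distinct servers chosen uniformly at random. It is not the authors' code: the function name `simulate_mean_occupancy` and the parameters `horizon` and `seed` are assumptions made for this example. The sketch relies on the exponential job sizes: a job receives rate $1/N_s$ from each of its servers $s$ (with $N_s$ jobs present), so the total departure rate equals the number of busy servers, and the departing job can be drawn by picking a busy server uniformly and then a job at that server uniformly.

```python
import random


def simulate_mean_occupancy(n, d, lam, horizon, seed=0):
    """Time-averaged number of jobs per server (hypothetical helper).

    n servers with unit-rate processor sharing, Poisson arrivals of rate
    n * lam, exponential job sizes with unit mean, d servers per job.
    """
    rng = random.Random(seed)
    jobs_at = [[] for _ in range(n)]  # job ids present at each server
    servers_of = {}                   # job id -> the d servers it joined
    busy = 0                          # number of non-empty servers
    total_occ = 0                     # sum of occupancies over all servers
    next_id = 0
    t = 0.0
    area = 0.0                        # integral of total_occ over time

    while True:
        # Arrivals at rate n * lam; departures at rate = number of busy servers.
        rate = n * lam + busy
        dt = rng.expovariate(rate)
        if t + dt >= horizon:
            area += total_occ * (horizon - t)
            break
        area += total_occ * dt
        t += dt

        if rng.random() * rate < n * lam:
            # Arrival: join d distinct servers chosen uniformly at random.
            chosen = rng.sample(range(n), d)
            servers_of[next_id] = chosen
            for s in chosen:
                if not jobs_at[s]:
                    busy += 1
                jobs_at[s].append(next_id)
            total_occ += d
            next_id += 1
        else:
            # Departure: uniform busy server (by rejection), then uniform job there.
            while True:
                s = rng.randrange(n)
                if jobs_at[s]:
                    break
            job = rng.choice(jobs_at[s])
            # The finished job leaves all d of its servers at once.
            for u in servers_of.pop(job):
                jobs_at[u].remove(job)
                if not jobs_at[u]:
                    busy -= 1
            total_occ -= d

    return area / (horizon * n)


if __name__ == "__main__":
    for d in (1, 2):
        occ = simulate_mean_occupancy(n=50, d=d, lam=0.9, horizon=500.0)
        print(f"d={d}: mean occupancy per server = {occ:.2f}")
```

For $d=1$ the servers decouple into independent M/M/1 processor sharing queues, whose mean occupancy is $\lambda/(1-\lambda)$; this gives a simple sanity check for the simulation. The values obtained for $d \geq 2$ at finite $n$ are only indicative, since the abstract's result concerns the large-$n$ mean-field limit.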
