Paper Title

A Model of Job Parallelism for Latency Reduction in Large-Scale Systems

Paper Authors

Ganesh, Ayalvadi; Mukhopadhyay, Arpan

Paper Abstract

Processing computation-intensive jobs on multiple processing cores in parallel is essential in many real-world applications. In this paper, we consider an idealised model for job parallelism in which a job can be served simultaneously by $d$ distinct servers. The job is considered complete when the total amount of work done on it by the $d$ servers equals its size. We study the effect of parallelism on the average delay of jobs. Specifically, we analyse a system consisting of $n$ parallel processor-sharing servers in which jobs arrive according to a Poisson process of rate $nλ$ ($λ<1$) and each job brings an exponentially distributed amount of work with unit mean. Upon arrival, a job selects $d$ servers uniformly at random and joins all the chosen servers simultaneously. We show by a mean-field analysis that, for fixed $d \geq 2$ and large $n$, the average occupancy of servers is $O(\log (1/(1-λ)))$ as $λ\to 1$, compared with an average occupancy of $O(1/(1-λ))$ for $d=1$. Thus, we obtain an exponential reduction in the response time of jobs through parallelism. We make significant progress towards rigorously justifying the mean-field analysis.
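
The model described in the abstract can be explored numerically. The following is a minimal simulation sketch, not the authors' mean-field analysis. It relies on the assumption that, because job sizes are exponential with unit mean and each processor-sharing server splits its unit capacity equally among its jobs, the system is a Markov chain that can be simulated by uniformization at rate $n(1+λ)$. The function name `simulate_mean_occupancy` and the parameters `steps`, `warmup`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def simulate_mean_occupancy(n, d, lam, steps, warmup, seed=0):
    """Estimate the time-average number of jobs per server.

    Assumed dynamics (a sketch of the model in the abstract):
    - jobs arrive at total rate n * lam and each joins d distinct servers
      chosen uniformly at random;
    - each server "rings" at rate 1; if it is busy, one of its jobs,
      chosen uniformly, completes and leaves all of its d servers.
      A job at a server holding k jobs therefore completes through that
      server at rate 1/k, matching equal-share processor sharing with
      exponential unit-mean job sizes.
    """
    rng = random.Random(seed)
    jobs_at = [[] for _ in range(n)]  # job ids currently at each server
    servers_of = {}                   # job id -> list of its d servers
    next_id = 0
    # Uniformized chain: each step is an arrival with this probability,
    # otherwise a potential service completion at a random server.
    p_arrival = lam / (1.0 + lam)
    total = 0.0

    for step in range(warmup + steps):
        if rng.random() < p_arrival:
            chosen = rng.sample(range(n), d)
            servers_of[next_id] = chosen
            for s in chosen:
                jobs_at[s].append(next_id)
            next_id += 1
        else:
            s = rng.randrange(n)
            if jobs_at[s]:  # an idle server produces a null event
                job = rng.choice(jobs_at[s])
                for t in servers_of.pop(job):
                    jobs_at[t].remove(job)
        if step >= warmup:
            # Each job occupies d servers, so the mean occupancy per
            # server is d * (number of jobs) / n.
            total += d * len(servers_of) / n

    return total / steps


if __name__ == "__main__":
    for d in (1, 2):
        est = simulate_mean_occupancy(n=100, d=d, lam=0.9,
                                      steps=200_000, warmup=50_000)
        print(f"d={d}: estimated mean occupancy per server = {est:.2f}")
```

For $d=1$ each server behaves as an independent M/M/1 processor-sharing queue, so the estimate can be sanity-checked against the known mean occupancy $λ/(1-λ)$. For $d \geq 2$ the abstract gives only the asymptotic order $O(\log (1/(1-λ)))$, so the simulated value indicates the trend at finite $n$ rather than a quantity to match exactly.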
