Paper Title
Cello: Efficient Computer Systems Optimization with Predictive Early Termination and Censored Regression
Paper Authors
Paper Abstract
Sample-efficient machine learning (SEML) has been widely applied to find optimal latency and power tradeoffs for configurable computer systems. Instead of randomly sampling from the configuration space, SEML reduces the search cost by dramatically reducing the number of configurations that must be sampled to optimize system goals (e.g., low latency or energy). Nevertheless, SEML reduces only one component of cost -- the total number of samples collected -- and does not decrease the cost of collecting each sample. Critically, not all samples are equal; some take much longer to collect because they correspond to slow system configurations. This paper presents Cello, a computer systems optimization framework that reduces sample collection costs -- especially those that come from the slowest configurations. The key insight is to predict ahead of time whether samples will have poor system behavior (e.g., long latency or high energy) and to terminate these samples early, before their measured system behavior surpasses the termination threshold; we call this predictive early termination. To predict future system behavior accurately before it manifests as high runtime or energy, Cello uses censored regression to produce accurate predictions for running samples. We evaluate Cello by optimizing latency and energy for Apache Spark workloads. We give Cello a fixed amount of time to search a combined space of hardware and software configuration parameters. Our evaluation shows that, compared to the state-of-the-art SEML approach in computer systems optimization, Cello improves latency by 1.19X when minimizing latency under a power constraint, and improves energy by 1.18X when minimizing energy under a latency constraint.
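To make the two ideas in the abstract concrete, the following is a minimal toy sketch, not Cello's actual implementation. It assumes a single hypothetical configuration feature x, a linear latency model with Gaussian noise of fixed scale sigma, and a fixed termination threshold; all function names and data are illustrative assumptions. Samples that were terminated early are treated as right-censored: their true latency is only known to be at least the recorded value. The censored fit is compared with a naive least-squares fit that wrongly treats the censored values as exact.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_sf(z):
    """Survival function 1 - Phi(z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def hazard(z):
    """pdf(z) / sf(z); uses the large-z asymptote to avoid dividing by ~0."""
    if z > 30.0:
        return z + 1.0 / z
    return normal_pdf(z) / normal_sf(z)

def fit_censored(samples, sigma=1.0, lr=0.01, steps=5000):
    """Gradient ascent on the right-censored Gaussian log-likelihood.

    samples: list of (x, y, censored). If censored is True, y is the point
    at which the run was stopped and the true latency is only known to be >= y.
    Returns (intercept, slope) of the assumed model latency = a + b * x.
    """
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y, censored in samples:
            z = (y - (a + b * x)) / sigma
            # Derivative of the log-likelihood w.r.t. the predicted mean:
            # plain residual for exact points, hazard term for censored ones.
            g = hazard(z) / sigma if censored else z / sigma
            grad_a += g
            grad_b += g * x
        a += lr * grad_a
        b += lr * grad_b
    return a, b

def fit_naive(samples):
    """Ordinary least squares that ignores the censoring flag."""
    n = len(samples)
    mx = sum(x for x, _, _ in samples) / n
    my = sum(y for _, y, _ in samples) / n
    sxy = sum((x - mx) * (y - my) for x, y, _ in samples)
    sxx = sum((x - mx) ** 2 for x, _, _ in samples)
    b = sxy / sxx
    return my - b * mx, b

def should_terminate(a, b, x, threshold):
    """Predictive early termination: stop if predicted latency is too high."""
    return a + b * x > threshold

# Hypothetical data: true latency is 2 + 3x, runs are cut off at THRESHOLD.
THRESHOLD = 9.0
samples = []
for x in range(5):
    true_latency = 2.0 + 3.0 * x
    samples.append((float(x), min(true_latency, THRESHOLD), true_latency > THRESHOLD))

a_cens, b_cens = fit_censored(samples)
a_naive, b_naive = fit_naive(samples)
print("censored fit:", a_cens, b_cens)
print("naive fit:   ", a_naive, b_naive)
print("terminate x=2.5 (censored model)?", should_terminate(a_cens, b_cens, 2.5, THRESHOLD))
print("terminate x=2.5 (naive model)?   ", should_terminate(a_naive, b_naive, 2.5, THRESHOLD))
```

In this toy setup the naive fit is biased toward lower latencies because the two terminated runs are recorded at the threshold rather than at their true values, so it fails to flag a configuration at x = 2.5 whose true latency (9.5) exceeds the threshold. The censored likelihood only requires the model to place those runs at or above the cutoff, which keeps the slope estimate close to the true value.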