Paper Title
FIRM: An Intelligent Fine-Grained Resource Management Framework for SLO-Oriented Microservices
Paper Authors
Paper Abstract
Modern user-facing latency-sensitive web services include numerous distributed, intercommunicating microservices that promise to simplify software development and operation. However, multiplexing of compute resources across microservices is still challenging in production because contention for shared resources can cause latency spikes that violate the service-level objectives (SLOs) of user requests. This paper presents FIRM, an intelligent fine-grained resource management framework for predictable sharing of resources across microservices to drive up overall utilization. FIRM leverages online telemetry data and machine-learning methods to adaptively (a) detect/localize microservices that cause SLO violations, (b) identify low-level resources in contention, and (c) take actions to mitigate SLO violations via dynamic reprovisioning. Experiments across four microservice benchmarks demonstrate that FIRM reduces SLO violations by up to 16x while reducing the overall requested CPU limit by up to 62%. Moreover, FIRM improves performance predictability by reducing tail latencies by up to 11x.
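The three steps named in the abstract (detect and localize, identify, mitigate) can be pictured as a simple control loop. The sketch below is illustrative only and is not FIRM's implementation: the abstract says FIRM uses machine-learning methods, whereas this example substitutes a fixed tail-to-median latency heuristic and a fixed CPU step. All names here (`ServiceSample`, `percentile`, `violates_slo`, `localize`, `mitigate`) and all thresholds are hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class ServiceSample:
    """Hypothetical telemetry snapshot for one microservice."""
    name: str
    latencies_ms: list[float]  # recent per-request latencies
    cpu_limit: float           # currently configured CPU limit, in cores


def percentile(values: list[float], q: int) -> float:
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def violates_slo(end_to_end_ms: list[float], slo_ms: float, q: int = 99) -> bool:
    """Step (a), detection: is the q-th percentile latency above the SLO?"""
    return percentile(end_to_end_ms, q) > slo_ms


def localize(samples: list[ServiceSample], q: int = 99) -> ServiceSample:
    """Step (a), localization: pick the service whose tail latency is most
    inflated relative to its median. This ratio is a stand-in heuristic
    (an assumption), not the learned model the abstract refers to."""
    def tail_ratio(s: ServiceSample) -> float:
        return percentile(s.latencies_ms, q) / percentile(s.latencies_ms, 50)
    return max(samples, key=tail_ratio)


def mitigate(sample: ServiceSample, step: float = 0.5, max_limit: float = 4.0) -> float:
    """Step (c): return a raised CPU limit, capped at max_limit.
    Step (b), choosing which low-level resource is contended, is skipped
    here; CPU is assumed to be the bottleneck."""
    return min(max_limit, sample.cpu_limit + step)
```

A caller would check `violates_slo` on end-to-end latencies, pass per-service samples to `localize` when it returns true, and apply the limit returned by `mitigate` through whatever orchestration interface is in use.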