Paper Title

Towards a General Framework for ML-based Self-tuning Databases

Authors

Thomas Schmied, Diego Didona, Andreas Döring, Thomas Parnell, Nikolas Ioannou

Abstract

Machine learning (ML) methods have recently emerged as an effective way to perform automated parameter tuning of databases. State-of-the-art approaches include Bayesian optimization (BO) and reinforcement learning (RL). In this work, we describe our experience when applying these methods to a database not yet studied in this context: FoundationDB. Firstly, we describe the challenges we faced, such as unknown valid ranges of configuration parameters and combinations of parameter values that result in invalid runs, and how we mitigated them. While these issues are typically overlooked, we argue that they are a crucial barrier to the adoption of ML self-tuning techniques in databases, and thus deserve more attention from the research community. Secondly, we present experimental results obtained when tuning FoundationDB using ML methods. Unlike prior work in this domain, we also compare with the simplest of baselines: random search. Our results show that, while BO and RL methods can improve the throughput of FoundationDB by up to 38%, random search is a highly competitive baseline, finding a configuration that is only 4% worse than the vastly more complex ML methods. We conclude that future work in this area may want to focus more on randomized, model-free optimization algorithms.
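The baseline the abstract highlights, random search, amounts to sampling each knob uniformly from its valid range and keeping the best-performing configuration. A minimal sketch of such a randomized, model-free tuner is below; the parameter names (`cache_mb`, `threads`) and the toy objective are illustrative stand-ins, not FoundationDB's actual knobs or benchmark:

```python
import random

def random_search(param_ranges, evaluate, n_trials=50, seed=0):
    """Randomized, model-free tuning: sample configurations uniformly
    from each parameter's valid range and keep the best one seen."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one candidate configuration uniformly at random.
        cfg = {name: rng.uniform(lo, hi) for name, (lo, hi) in param_ranges.items()}
        score = evaluate(cfg)  # in practice: measured throughput of a benchmark run
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy stand-in for a database benchmark (hypothetical knobs): throughput
# peaks at cache_mb=512, threads=16 and falls off quadratically.
def toy_throughput(cfg):
    return -((cfg["cache_mb"] - 512) ** 2) / 1e4 - (cfg["threads"] - 16) ** 2

ranges = {"cache_mb": (64, 2048), "threads": (1, 64)}
best_cfg, best_score = random_search(ranges, toy_throughput, n_trials=200)
```

A real tuner would replace `toy_throughput` with an actual benchmark run, and, as the paper notes, would also need to handle configurations whose valid ranges are unknown or whose combinations cause invalid runs (e.g. by treating a failed run as a very low score).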
