Efficient Query Re-optimization with Judicious Subquery Selections

Paper Title

Efficient Query Re-optimization with Judicious Subquery Selections

Authors

Zhao, Junyi; Zhang, Huanchen; Gao, Yihan

Abstract

Query re-optimization is an adaptive query processing technique that re-invokes the optimizer at certain points during query execution. The goal is to correct cardinality estimation errors dynamically, using statistics collected at runtime, and to adjust the query plan to improve overall performance. We identify a key weakness in existing re-optimization algorithms: their subquery division and re-optimization trigger strategies rely heavily on the optimizer's initial plan, which can be far from optimal. We therefore propose QuerySplit, a novel re-optimization algorithm that skips the potentially misleading global plan and instead generates subqueries directly from the logical plan as the basic re-optimization units. By developing a cost function that prioritizes the execution of less "damaging" subqueries, QuerySplit successfully postpones (and sometimes avoids) the execution of complex large joins to maximize their probability of having smaller input sizes. We implemented QuerySplit in PostgreSQL and compared our solution against four state-of-the-art re-optimization algorithms using the Join Order Benchmark. Our experiments show that QuerySplit reduces the benchmark execution time by 35% compared to the second-best alternative. The performance gap between QuerySplit and an optimal optimizer is within 4%.
