Paper Title

AMLB: an AutoML Benchmark

Authors

Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell, Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren

Abstract

Comparing different AutoML frameworks is notoriously challenging and often done incorrectly. We introduce an open and extensible benchmark that follows best practices and avoids common mistakes when comparing AutoML frameworks. We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks. The differences between the AutoML frameworks are explored with a multi-faceted analysis, evaluating model accuracy, its trade-offs with inference time, and framework failures. We also use Bradley-Terry trees to discover subsets of tasks where the relative AutoML framework rankings differ. The benchmark comes with an open-source tool that integrates with many AutoML frameworks and automates the empirical evaluation process end-to-end: from framework installation and resource allocation to in-depth evaluation. The benchmark uses public data sets, can be easily extended with other AutoML frameworks and tasks, and has a website with up-to-date results.
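
As a rough illustration of the end-to-end automation described above, the benchmark's runner can be driven with a single call per framework, task suite, and resource constraint. The Python sketch below shells out to such a runner script; the script name (runbenchmark.py) and the framework, suite, and constraint identifiers are illustrative assumptions, not details given in the abstract.

# A minimal sketch, assuming the benchmark tool ships a command-line
# runner (runbenchmark.py) that takes the AutoML framework, the task
# suite, and a resource constraint as positional arguments.
# All three identifiers below are illustrative placeholders.
import subprocess

subprocess.run(
    [
        "python", "runbenchmark.py",
        "randomforest",  # assumed: framework or baseline to evaluate
        "example",       # assumed: suite of benchmark tasks to run on
        "test",          # assumed: resource constraint (time/CPU budget)
    ],
    check=True,  # raise CalledProcessError if the run exits non-zero
)

Per the abstract, a call like this would cover the whole pipeline, from installing the framework and allocating resources to producing the evaluation results.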
