Paper Title
Consolidated learning -- a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV
Paper Authors
Paper Abstract
For many machine learning models, the choice of hyperparameters is a crucial step towards achieving high performance. Prevalent meta-learning approaches focus on obtaining good hyperparameter configurations with a limited computational budget for a completely new task, based on results obtained from prior tasks. This paper proposes a new formulation of the tuning problem, called consolidated learning, better suited to the practical challenges faced by model developers in which a large number of predictive models are created on similar data sets. In such settings, we are interested in the total optimization time rather than in tuning for a single task. We show that a carefully selected static portfolio of hyperparameter configurations yields good results for anytime optimization while remaining easy to use and implement. Moreover, we point out how to construct such a portfolio for specific domains. The improvement in optimization is possible due to a more efficient transfer of hyperparameter configurations between similar tasks. We demonstrate the effectiveness of this approach through an empirical study of the XGBoost algorithm and a collection of predictive tasks extracted from the MIMIC-IV medical database; however, consolidated learning is applicable in many other fields.
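
Below is a minimal Python sketch of the static-portfolio idea described in the abstract: a fixed, ordered list of XGBoost configurations is evaluated one by one, and the best result found so far is kept, which gives the anytime property. The portfolio entries, dataset, and evaluation protocol here are hypothetical placeholders for illustration; they are not the configurations or tasks selected in the paper.

    # Hedged sketch: evaluate a static hyperparameter portfolio in order,
    # keeping the best configuration seen so far (anytime optimization).
    # The portfolio below is a made-up example, not the paper's portfolio.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from xgboost import XGBClassifier

    # Hypothetical domain-specific portfolio: an ordered list of configurations,
    # assumed to have been derived from tuning results on similar prior tasks.
    PORTFOLIO = [
        {"n_estimators": 200, "max_depth": 4, "learning_rate": 0.1},
        {"n_estimators": 500, "max_depth": 6, "learning_rate": 0.05},
        {"n_estimators": 100, "max_depth": 8, "learning_rate": 0.3},
    ]

    # Placeholder task standing in for one prediction task from the domain.
    X, y = make_classification(n_samples=1000, random_state=0)

    best_score, best_params = float("-inf"), None
    for params in PORTFOLIO:  # evaluated in order; can be stopped at any time
        score = cross_val_score(XGBClassifier(**params), X, y, cv=5).mean()
        if score > best_score:
            best_score, best_params = score, params

    print(best_score, best_params)

In consolidated learning as the abstract describes it, the contents and ordering of such a portfolio would be constructed from results on previous, similar tasks within the domain (here, prediction tasks extracted from MIMIC-IV), rather than chosen by hand as in this sketch.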