Paper title
Multi-step Planning for Automated Hyperparameter Optimization with OptFormer
Paper authors
Paper abstract
As machine learning permeates more industries and models become more expensive and time-consuming to train, the need for efficient automated hyperparameter optimization (HPO) has never been more pressing. Multi-step planning approaches to hyperparameter optimization promise improved efficiency over myopic alternatives by balancing exploration and exploitation more effectively. However, the potential of these approaches has not been fully realized because of their technical complexity and computational intensity. In this work, we leverage recent advances in Transformer-based, natural-language-interfaced hyperparameter optimization to circumvent these barriers. We build on the recently proposed OptFormer, which casts both hyperparameter suggestion and target function approximation as autoregressive generation, thus making planning via rollouts simple and efficient. We conduct an extensive exploration of different strategies for performing multi-step planning on top of the OptFormer model to highlight its potential for constructing non-myopic HPO strategies.
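To make the idea of planning via rollouts concrete, the following is a minimal sketch, not the method from the paper. It assumes a single model that can both propose the next hyperparameter and predict its objective value. `ToySurrogate`, `suggest`, `predict`, `rollout_value`, and `plan_next` are hypothetical names introduced for illustration only and are not part of any OptFormer API. The toy objective shape and the scoring rule (best predicted value along an imagined trajectory, averaged over rollouts) are also assumptions.

```python
import random


class ToySurrogate:
    """Hypothetical stand-in for a model that proposes and scores trials.

    This is NOT the OptFormer API; it only mimics the two roles the
    abstract describes (suggestion and function approximation).
    """

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def suggest(self, history):
        # Propose uniformly at random when there is no history,
        # otherwise perturb the best point seen so far.
        if not history:
            return self.rng.uniform(0.0, 1.0)
        best_x, _ = max(history, key=lambda pair: pair[1])
        return min(1.0, max(0.0, best_x + self.rng.gauss(0.0, 0.2)))

    def predict(self, history, x):
        # Assumed toy objective: a noisy quadratic peaking at x = 0.7.
        return -(x - 0.7) ** 2 + self.rng.gauss(0.0, 0.01)


def rollout_value(model, history, first_x, horizon):
    """Imagine `horizon` trials starting with `first_x`.

    Returns the best predicted value along the imagined trajectory.
    The real history is copied, so it is never modified.
    """
    imagined = list(history)
    x = first_x
    best = float("-inf")
    for _ in range(horizon):
        y = model.predict(imagined, x)   # model plays the objective
        imagined.append((x, y))
        best = max(best, y)
        x = model.suggest(imagined)      # model plays the optimizer
    return best


def plan_next(model, history, num_candidates=8, num_rollouts=4, horizon=3):
    """Pick the candidate whose rollouts look best on average."""
    candidates = [model.suggest(history) for _ in range(num_candidates)]

    def score(x):
        total = sum(
            rollout_value(model, history, x, horizon)
            for _ in range(num_rollouts)
        )
        return total / num_rollouts

    return max(candidates, key=score)


if __name__ == "__main__":
    observed = [(0.2, -0.25), (0.5, -0.04)]  # (hyperparameter, value)
    print(plan_next(ToySurrogate(seed=0), observed))
```

Setting `horizon=1` reduces the sketch to a myopic choice that scores each candidate only by its own predicted value. Larger horizons let a candidate earn credit for the follow-up trials it is expected to lead to.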