Paper Title
Discrete Simulation Optimization for Tuning Machine Learning Method Hyperparameters
Paper Authors
Paper Abstract
Machine learning (ML) methods are used in most technical areas, such as image recognition, product recommendation, financial analysis, medical diagnosis, and predictive maintenance. An important aspect of implementing ML methods involves controlling the learning process of the ML method so as to maximize the performance of the method under consideration. Hyperparameter tuning is the process of selecting a suitable set of ML method parameters that control its learning process. In this work, we demonstrate the use of discrete simulation optimization methods, such as ranking and selection (R&S) and random search, for identifying a hyperparameter set that maximizes the performance of an ML method. Specifically, we use the KN R&S method and the stochastic ruler random search method, along with one of its variations, for this purpose. We also construct the theoretical basis for applying the KN method, which determines the optimal solution with a statistical guarantee via solution space enumeration. In comparison, the stochastic ruler method asymptotically converges to a global optimum and incurs smaller computational overhead. We demonstrate the application of these methods to a wide variety of machine learning models, including deep neural network models used for time series prediction and image classification. We benchmark our application of these methods against state-of-the-art hyperparameter optimization libraries such as $hyperopt$ and $mango$. The KN method consistently outperforms $hyperopt$'s random search (RS) and Tree of Parzen Estimators (TPE) methods. The stochastic ruler method outperforms the $hyperopt$ RS method and offers performance statistically comparable to that of $hyperopt$'s TPE method and the $mango$ algorithm.
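To make the flavor of the discrete simulation optimization approach concrete, the following is a minimal pure-Python sketch of a stochastic ruler search over a toy hyperparameter grid. The grid, the noisy "validation accuracy" model, and all parameter values here are illustrative assumptions for exposition only; they are not the paper's actual experimental setup, which applies the KN and stochastic ruler methods to real ML models.

```python
import random

def stochastic_ruler(domain, sample_obj, a, b, neighbors,
                     iters=500, tests=3, seed=0):
    """Minimal sketch of a stochastic ruler search for maximizing a noisy
    objective over a discrete solution space.

    domain:     list of discrete candidate solutions (hyperparameter settings)
    sample_obj: sample_obj(x, rng) -> one noisy observation of the objective at x
    [a, b]:     interval covering the objective observations (the "ruler")
    neighbors:  neighbors(x) -> list of candidate moves from x
    """
    rng = random.Random(seed)
    x = rng.choice(domain)
    for _ in range(iters):
        z = rng.choice(neighbors(x))
        # Move to z only if every noisy observation of z beats an independent
        # Uniform(a, b) draw from the ruler; otherwise stay at x.
        if all(sample_obj(z, rng) > rng.uniform(a, b) for _ in range(tests)):
            x = z
    return x

# Toy problem: choose an index into a (hypothetical) learning-rate grid whose
# noisy validation accuracy peaks at index 6.
grid = list(range(10))

def accuracy(i, rng):
    # Made-up accuracy landscape with Gaussian observation noise.
    return 0.9 - 0.02 * (i - 6) ** 2 + rng.gauss(0.0, 0.02)

def adjacent(i):
    return [j for j in (i - 1, i + 1) if 0 <= j <= 9]

best = stochastic_ruler(grid, accuracy, 0.0, 1.0, adjacent)
print("selected grid index:", best)
```

Because acceptance requires beating several independent uniform draws, moves toward low-accuracy settings are rejected with high probability, which is what drives the method's asymptotic convergence to a global optimum while keeping per-iteration cost to a handful of objective samples.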