Paper Title

Click prediction boosting via Bayesian hyperparameter optimization based ensemble learning pipelines

Authors

Demirel, Çağatay; Tokuç, A. Aylin; Tekin, Ahmet Tezcan

Abstract

Online travel agencies (OTAs) advertise their website offers on meta-search bidding engines. Predicting the number of clicks a hotel would receive for a given bid amount is an important step in managing an OTA's advertisement campaign on a meta-search engine, because the bid amount multiplied by the number of clicks determines the cost to be incurred. In this work, various regressors are ensembled to improve click prediction performance. Following the preprocessing procedures, the feature set is divided into train and test groups based on the logging date of the samples. The data collection is then subjected to feature elimination using XGBoost, which significantly reduces the feature dimension. The optimal hyperparameters are then found by applying Bayesian hyperparameter optimization to the XGBoost, LightGBM, and SGD models. The individually trained models are tested separately as well as combined into ensemble models. Four alternative ensemble solutions are proposed. The same test set is used to evaluate both the individual and ensemble models, and the results of 46 model combinations demonstrate that the stacking ensemble models yield the best R2 score of all. In conclusion, the ensemble model improves prediction performance by about 10%.
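
The abstract outlines a three-stage pipeline: XGBoost-driven feature elimination, Bayesian hyperparameter optimization of XGBoost, LightGBM, and SGD regressors, and an ensemble of the tuned models. The sketch below illustrates one way such a pipeline could be wired together; it is not the authors' implementation. The synthetic data, the random (rather than date-based) split, the hyperparameter ranges, the use of scikit-optimize's BayesSearchCV, and the default stacking meta-learner are all assumptions for illustration.

```python
# Minimal sketch of the described pipeline, assuming scikit-learn, xgboost,
# lightgbm, and scikit-optimize are installed. Data and search ranges are
# illustrative placeholders, not the paper's OTA click dataset or settings.
from sklearn.datasets import make_regression          # stand-in for OTA click data
from sklearn.model_selection import train_test_split  # paper splits by logging date instead
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import SGDRegressor
from sklearn.ensemble import StackingRegressor
from sklearn.metrics import r2_score
from xgboost import XGBRegressor
from lightgbm import LGBMRegressor
from skopt import BayesSearchCV
from skopt.space import Real, Integer

X, y = make_regression(n_samples=2000, n_features=50, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Step 1: feature elimination driven by XGBoost feature importances.
selector = SelectFromModel(XGBRegressor(n_estimators=200, random_state=0),
                           threshold="median").fit(X_train, y_train)
X_train_sel, X_test_sel = selector.transform(X_train), selector.transform(X_test)

# Step 2: Bayesian hyperparameter optimization for each base regressor.
xgb_search = BayesSearchCV(
    XGBRegressor(random_state=0),
    {"n_estimators": Integer(100, 600),
     "max_depth": Integer(3, 10),
     "learning_rate": Real(0.01, 0.3, prior="log-uniform")},
    n_iter=20, cv=3, scoring="r2", random_state=0,
).fit(X_train_sel, y_train)

lgbm_search = BayesSearchCV(
    LGBMRegressor(random_state=0),
    {"n_estimators": Integer(100, 600),
     "num_leaves": Integer(15, 127),
     "learning_rate": Real(0.01, 0.3, prior="log-uniform")},
    n_iter=20, cv=3, scoring="r2", random_state=0,
).fit(X_train_sel, y_train)

sgd_search = BayesSearchCV(
    SGDRegressor(random_state=0),
    {"alpha": Real(1e-6, 1e-1, prior="log-uniform")},
    n_iter=10, cv=3, scoring="r2", random_state=0,
).fit(X_train_sel, y_train)

# Step 3: stacking ensemble of the tuned base models (one of the four ensemble
# variants mentioned; the default RidgeCV meta-learner here is an assumption).
stack = StackingRegressor(
    estimators=[("xgb", xgb_search.best_estimator_),
                ("lgbm", lgbm_search.best_estimator_),
                ("sgd", sgd_search.best_estimator_)],
).fit(X_train_sel, y_train)

print("Stacking ensemble R2:", r2_score(y_test, stack.predict(X_test_sel)))
```

In this sketch the base models are tuned independently and then combined, mirroring the abstract's description of testing individual models before forming ensembles; comparing each tuned model's test R2 against the stacked model's R2 reproduces the kind of comparison the paper reports.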
