Paper Title

BoostTree and BoostForest for Ensemble Learning

Paper Authors

Zhao, Changming; Wu, Dongrui; Huang, Jian; Yuan, Ye; Zhang, Hai-Tao; Peng, Ruimin; Shi, Zhenhua

Paper Abstract

Bootstrap aggregating (bagging) and boosting are two popular ensemble learning approaches that combine multiple base learners into a composite model for more accurate and more reliable performance. They have been widely used in biology, engineering, healthcare, etc. This paper proposes BoostForest, an ensemble learning approach that uses BoostTrees as its base learners and can be applied to both classification and regression. BoostTree constructs a tree model by gradient boosting. It increases randomness (diversity) by randomly drawing cut-points when splitting nodes. BoostForest further increases randomness by bootstrapping the training data when constructing different BoostTrees. On 35 classification and regression datasets, BoostForest generally outperformed four classical ensemble learning approaches (Random Forest, Extra-Trees, XGBoost, and LightGBM). Remarkably, BoostForest tunes its parameters by simply sampling them at random from a parameter pool, which can be easily specified, and its ensemble learning framework can also be used to combine many other base learners.
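The abstract highlights two sources of randomness in BoostForest: each BoostTree is trained on a bootstrap replica of the data, and each tree's hyper-parameters are drawn at random from a user-specified parameter pool. A minimal sketch of that ensemble-construction loop is shown below; `train_tree`, the pool contents, and the data layout are hypothetical placeholders standing in for the paper's actual BoostTree training procedure, not the authors' implementation.

```python
import random

def bootstrap_sample(data, rng):
    # Draw len(data) samples with replacement (a bootstrap replica).
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def sample_params(param_pool, rng):
    # BoostForest-style tuning: pick each hyper-parameter uniformly
    # at random from a user-specified pool of candidate values.
    return {name: rng.choice(values) for name, values in param_pool.items()}

def build_forest(data, param_pool, n_trees, train_tree, seed=0):
    # Each base learner sees its own bootstrap replica and its own
    # random hyper-parameter draw, injecting diversity into the ensemble.
    # train_tree is a placeholder for the BoostTree training procedure.
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        replica = bootstrap_sample(data, rng)
        params = sample_params(param_pool, rng)
        forest.append(train_tree(replica, params))
    return forest
```

Because the pool is just a dictionary of candidate values, swapping in a different base learner only requires supplying a different `train_tree` callable, which mirrors the framework's claimed generality.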
