Paper title
Stochastic tree ensembles for regularized nonlinear regression
Paper authors
Paper abstract
This paper develops a novel stochastic tree ensemble method for nonlinear regression, which we refer to as XBART, short for Accelerated Bayesian Additive Regression Trees. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning approaches, the new method attains state-of-the-art performance: in many settings it is both faster and more accurate than the widely used XGBoost algorithm. Via careful simulation studies, we demonstrate that our new approach provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras). We also prove a number of basic theoretical results about the new algorithm, including consistency of the single-tree version of the model and stationarity of the Markov chain produced by the ensemble version. Furthermore, we demonstrate that initializing standard Bayesian additive regression trees (BART) Markov chain Monte Carlo (MCMC) at XBART-fitted trees considerably improves credible interval coverage and reduces total runtime.
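The abstract does not spell out how stochastic search is combined with recursive partitioning. As a rough illustration only, the sketch below shows one way a single split could be sampled rather than chosen greedily: each candidate cutpoint on one feature is weighted by the marginal likelihood of the two resulting leaves, with a "no split" option competing alongside them. It assumes a Gaussian leaf-mean prior N(0, tau) and Gaussian noise with variance sigma2. The function names, `tau`, `sigma2`, and the unweighted treatment of the no-split option are hypothetical choices made for this example, not taken from the paper or from any XBART software.

```python
import numpy as np


def leaf_log_marginal(s, n, tau, sigma2):
    """Log marginal likelihood of a leaf with response sum s and size n,
    with the leaf mean mu ~ N(0, tau) integrated out. Terms that are
    identical for every partition of the same data are dropped."""
    return (0.5 * np.log(sigma2 / (sigma2 + tau * n))
            + 0.5 * tau * s ** 2 / (sigma2 * (sigma2 + tau * n)))


def sample_cutpoint(x, y, tau=1.0, sigma2=1.0, rng=None):
    """Sample a cutpoint on one feature with probability proportional to
    the marginal likelihood of the induced two-leaf partition.

    Returns (cut, p): cut is None if "no split" was drawn, and p holds the
    sampling probabilities (the last entry is the no-split option).
    Assumes distinct x values (an assumption of this sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(ys)

    # Cumulative sums give every left/right sufficient statistic in one pass.
    csum = np.cumsum(ys)
    n_left = np.arange(1, n)          # left-leaf sizes 1 .. n-1
    s_left = csum[:-1]
    s_right = csum[-1] - s_left

    log_w = (leaf_log_marginal(s_left, n_left, tau, sigma2)
             + leaf_log_marginal(s_right, n - n_left, tau, sigma2))
    # Append the no-split candidate: all observations stay in a single leaf.
    log_w = np.append(log_w, leaf_log_marginal(csum[-1], n, tau, sigma2))

    # Normalize in a numerically stable way, then draw one candidate.
    p = np.exp(log_w - log_w.max())
    p /= p.sum()
    k = rng.choice(len(p), p=p)
    if k == n - 1:
        return None, p
    return 0.5 * (xs[k] + xs[k + 1]), p


# Usage: a step function with a jump at x = 0.5.
x = np.linspace(0.0, 1.0, 50)
y = np.where(x < 0.5, -3.0, 3.0)
cut, p = sample_cutpoint(x, y, rng=np.random.default_rng(0))
print(cut, int(np.argmax(p)))
```

Because candidates are drawn in proportion to their likelihood instead of always taking the maximizer, repeated calls can explore different trees while still strongly favoring splits that explain the data well. In this toy example nearly all probability mass sits on the cutpoint that separates the two levels.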