Paper title
Analyzing the tree-layer structure of Deep Forests
Paper authors
Paper abstract
Random forests and neural networks have both met with great success in the machine learning community for their predictive performance. Combinations of the two have been proposed in the literature, notably leading to the so-called deep forests (DF) (Zhou & Feng, 2019). In this paper, our aim is not to benchmark DF performance but to investigate its underlying mechanisms. Additionally, the DF architecture can generally be simplified into simpler and more computationally efficient shallow forest networks. Despite some instability, the latter may outperform standard predictive tree-based methods. We exhibit a theoretical framework in which a shallow tree network is shown to enhance the performance of classical decision trees. In this setting, we provide tight theoretical lower and upper bounds on its excess risk. These theoretical results show the value of tree-network architectures for well-structured data, provided that the first layer, acting as a data encoder, is rich enough.