Title
Measure Inducing Classification and Regression Trees for Functional Data
Authors
Abstract
We propose a tree-based algorithm for classification and regression problems in functional data analysis that leverages representation learning and multiple splitting rules at the node level, reducing generalization error while retaining the interpretability of a tree. This is achieved by learning a weighted functional $L^{2}$ space through constrained convex optimization; the learned space is then used to extract multiple weighted integral features from the input functions, which determine the binary split at each internal node of the tree. The approach is designed to handle multiple functional inputs and/or outputs by defining suitable splitting rules and loss functions, which can depend on the specific problem. Because the tree is grown with the original greedy CART algorithm, the approach can also be combined with scalar and categorical data. We focus on scalar-valued functional inputs defined on unidimensional domains and illustrate the effectiveness of our method in both classification and regression tasks through a simulation study and four real-world applications.
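To make the node-level idea concrete, the following is a minimal Python sketch, not the authors' implementation. It approximates a single weighted integral feature $\int w(t)\,x(t)\,dt$ with the trapezoidal rule and then picks a CART-style threshold on that feature by Gini impurity. The weight function is fixed by hand here, whereas the paper learns it through constrained convex optimization. All names (`weighted_integral`, `gini`, `best_split`) and the toy curves are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def weighted_integral(x: Sequence[float], w: Sequence[float],
                      t: Sequence[float]) -> float:
    """Trapezoidal approximation of the integral of w(t) * x(t) over the grid t."""
    total = 0.0
    for i in range(len(t) - 1):
        left = w[i] * x[i]
        right = w[i + 1] * x[i + 1]
        total += 0.5 * (left + right) * (t[i + 1] - t[i])
    return total


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(features: Sequence[float],
               labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini impurity) of the best binary split."""
    values = sorted(set(features))
    best = (float("nan"), float("inf"))
    n = len(labels)
    # Candidate thresholds are midpoints between consecutive feature values.
    for lo, hi in zip(values, values[1:]):
        thr = 0.5 * (lo + hi)
        left = [y for f, y in zip(features, labels) if f <= thr]
        right = [y for f, y in zip(features, labels) if f > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (thr, score)
    return best


# Toy data (assumption): four constant curves sampled on [0, 1].
t = [i / 10 for i in range(11)]
# Hand-picked weight (assumption): emphasises the second half of the domain.
w = [1.0 if s >= 0.5 else 0.0 for s in t]
curves = [[0.0] * 11, [0.2] * 11, [1.0] * 11, [1.2] * 11]
labels = [0, 0, 1, 1]

features = [weighted_integral(x, w, t) for x in curves]
threshold, impurity = best_split(features, labels)
```

On this toy data the two classes are separated perfectly by one threshold on the weighted integral feature. In the method described above, several such features are extracted per node and the weights themselves are the outcome of the optimization step, which this sketch does not attempt.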