Paper Title
DDAC-SpAM: A Distributed Algorithm for Fitting High-dimensional Sparse Additive Models with Feature Division and Decorrelation
Authors
Abstract
Distributed statistical learning has become a popular technique for large-scale data analysis. Most existing work in this area divides the observations across machines; we instead propose a new algorithm, DDAC-SpAM, which divides the features under a high-dimensional sparse additive model. Our approach involves three steps: divide, decorrelate, and conquer. The decorrelation operation enables each local estimator to recover the sparsity pattern for each additive component without imposing strict constraints on the correlation structure among variables. The effectiveness and efficiency of the proposed algorithm are demonstrated through theoretical analysis and empirical results on both synthetic and real data. The theoretical results include consistent sparsity pattern recovery as well as statistical inference for each additive functional component. Our approach provides a practical solution for fitting sparse additive models, with promising applications in a wide range of domains.
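The three steps named in the abstract (divide, decorrelate, conquer) can be illustrated with a small single-machine simulation. This is only a rough sketch under stated assumptions, not the authors' implementation: the abstract does not give the basis, the form of the decorrelation matrix, or the local solver. Here a centered polynomial basis stands in for whatever basis the paper uses, the decorrelation matrix is assumed to be the inverse square root of a ridge-regularized Gram matrix, and each "worker" runs a generic proximal-gradient group lasso. All function names (`basis_expand`, `decorrelation_matrix`, `group_lasso`, `ddac_spam_sketch`) and parameters (`degree`, `ridge`, `lam`) are hypothetical.

```python
import numpy as np

def basis_expand(x, degree=3):
    """Centered, unit-norm polynomial basis for one feature (assumed stand-in basis)."""
    x = np.asarray(x, dtype=float)
    B = np.column_stack([x ** d for d in range(1, degree + 1)])
    B = B - B.mean(axis=0)
    return B / np.linalg.norm(B, axis=0)

def decorrelation_matrix(blocks, ridge=1.0):
    """Assumed form: F = (sum_k Z_k Z_k^T / q + ridge * I)^(-1/2), q = total columns."""
    n = blocks[0].shape[0]
    q = sum(Z.shape[1] for Z in blocks)
    gram = sum(Z @ Z.T for Z in blocks) / q      # only n-by-n summaries are combined
    w, V = np.linalg.eigh(gram + ridge * np.eye(n))
    return (V / np.sqrt(w)) @ V.T                # V diag(w^-1/2) V^T

def group_lasso(Z, y, group_size, lam, n_iter=500):
    """Proximal gradient for (1/2n)||y - Z b||^2 + lam * sum_g ||b_g||_2."""
    n, q = Z.shape
    step = n / np.linalg.norm(Z, 2) ** 2         # 1 / Lipschitz constant of the gradient
    b = np.zeros(q)
    for _ in range(n_iter):
        b = b - step * (Z.T @ (Z @ b - y)) / n   # gradient step
        G = b.reshape(-1, group_size)            # one row per feature (group)
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        b = (G * shrink).ravel()                 # group soft-thresholding
    return b.reshape(-1, group_size)

def ddac_spam_sketch(X, y, n_workers=2, degree=3, lam=0.05, ridge=1.0):
    """Return indices of features whose additive component is estimated as nonzero."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    # Divide: split the feature indices (not the observations) across workers.
    feature_blocks = np.array_split(np.arange(p), n_workers)
    basis_blocks = [
        np.hstack([basis_expand(X[:, j], degree) for j in idx])
        for idx in feature_blocks
    ]
    # Decorrelate: premultiply every block and the response by the same matrix F.
    F = decorrelation_matrix(basis_blocks, ridge)
    y_tilde = F @ (y - y.mean())
    # Conquer: each worker fits a group-sparse model on its own features only.
    selected = []
    for idx, Z in zip(feature_blocks, basis_blocks):
        coef = group_lasso(F @ Z, y_tilde, degree, lam)
        keep = np.linalg.norm(coef, axis=1) > 0
        selected.extend(idx[keep].tolist())
    return sorted(selected)
```

A call such as `ddac_spam_sketch(X, y, n_workers=3, lam=0.05)` on an `n`-by-`p` array `X` returns a sorted list of selected feature indices. Which features are selected depends strongly on `lam` and `ridge`, which would need tuning (for example by cross-validation); the sketch makes no claim to reproduce the recovery guarantees described in the abstract.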