Paper Title
Resampling-Based Multisplit Inference for High-Dimensional Regression
Paper Authors
Paper Abstract
We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make confidence statements on relevant predictor variables. The method constructs permutation test statistics for each individual hypothesis by means of repeated splits of the data and a variable selection technique; it then defines a test for any subset by suitably aggregating the test statistics of the variables in that subset. The resulting procedure is extremely flexible, as it allows different selection techniques and several combining functions. We present it in two versions: an exact method and an approximate one that requires less memory and computation time and scales to higher dimensions. We illustrate the performance of the method with simulations and the analysis of real gene expression data.
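The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-variable test statistics have already been computed and stored in a table whose first row holds the observed values and whose remaining rows hold the values under random resampling transformations. The function name `subset_p_value`, the table layout, and the toy numbers are all hypothetical. The combining function is passed as an argument (here `sum` or `max`), and the p-value is the usual permutation p-value: the fraction of rows, observed row included, whose combined statistic is at least as large as the observed one.

```python
from typing import Callable, Iterable, Sequence


def subset_p_value(
    stats: Sequence[Sequence[float]],
    subset: Sequence[int],
    combine: Callable[[Iterable[float]], float] = sum,
) -> float:
    """Permutation p-value for a subset of variables.

    Assumed layout (hypothetical): stats[0] holds the observed per-variable
    statistics, and stats[1:] hold the statistics under resampling.
    """
    # Combine the statistics of the variables in the subset, row by row.
    combined = [combine(row[j] for j in subset) for row in stats]
    observed = combined[0]
    # Count rows (including the observed one) at least as extreme as observed.
    exceed = sum(1 for value in combined if value >= observed)
    return exceed / len(combined)


# Toy table: 1 observed row + 4 resampled rows, 3 variables (made-up numbers).
toy_stats = [
    [4.0, 0.5, 3.0],  # observed
    [1.0, 0.7, 0.2],
    [0.3, 1.2, 0.9],
    [2.0, 0.1, 0.4],
    [0.6, 0.4, 1.1],
]

p_sum = subset_p_value(toy_stats, [0, 2])               # sum combination
p_single = subset_p_value(toy_stats, [1])               # single variable
p_max = subset_p_value(toy_stats, [0, 1, 2], combine=max)  # max combination
```

Because the same resampled rows are reused for every subset, the table only needs to be computed once and can then be queried for any subset of interest, which is what allows such a test to be plugged into a multiple testing procedure.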