Robust High-dimensional Tuning Free Multiple Testing

Paper Title

Robust High-dimensional Tuning Free Multiple Testing

Authors

Jianqing Fan, Zhipeng Lou, Mengxin Yu

Abstract

A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference. Yet existing developments such as Winsorization, Huberization and median of means require bounded second moments and involve variable-dependent tuning parameters, which hamper their fidelity in applications to large-scale problems. To liberate these constraints, this paper revisits the celebrated Hodges-Lehmann (HL) estimator for estimating location parameters in both the one- and two-sample problems, from a non-asymptotic perspective. Our study develops a Berry-Esseen inequality and a Cramér-type moderate deviation for the HL estimator based on a newly developed non-asymptotic Bahadur representation, and builds data-driven confidence intervals via a weighted bootstrap approach. These results allow us to extend the HL estimator to large-scale studies and propose tuning-free and moment-free high-dimensional inference procedures for testing the global null and for large-scale multiple testing with false discovery proportion control. It is convincingly shown that the resulting tuning-free and moment-free methods control the false discovery proportion at a prescribed level. The simulation studies lend further support to the developed theory.
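
For readers unfamiliar with the Hodges-Lehmann estimator named in the abstract, the following is a minimal sketch of its classical textbook form, not the paper's implementation. It assumes the one-sample estimator is the median of the Walsh averages (x_i + x_j)/2 over pairs with i <= j, and the two-sample estimator is the median of all pairwise differences x_i - y_j; the paper's exact convention (for example, restricting to i < j) may differ. The function names `hl_one_sample` and `hl_two_sample` are hypothetical.

```python
from itertools import combinations_with_replacement, product
from statistics import median


def hl_one_sample(xs):
    """Classical one-sample HL location estimate.

    Assumption: median of Walsh averages (x_i + x_j) / 2 over i <= j.
    """
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(xs, 2)]
    return median(walsh)


def hl_two_sample(xs, ys):
    """Classical two-sample HL shift estimate.

    Assumption: median of all pairwise differences x_i - y_j.
    """
    diffs = [x - y for x, y in product(xs, ys)]
    return median(diffs)


# One gross outlier (100) dominates the sample mean (26.5) but barely
# moves the HL estimate, and no tuning parameter had to be chosen.
sample = [1, 2, 3, 100]
hl_estimate = hl_one_sample(sample)
shift_estimate = hl_two_sample([5, 6, 7], [1, 2, 3])
```

Both functions enumerate all pairs, so they cost quadratic time and memory in the sample size; that is acceptable for illustration but would need a faster algorithm for large samples.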
