Paper Title
Global Bias-Corrected Divide-and-Conquer by Quantile-Matched Composite for General Nonparametric Regressions
Paper Authors
Paper Abstract
Bias correction and robustness are crucial issues for divide-and-conquer (DC) strategies, especially for asymmetric nonparametric models with massive data. Quantile-based methods are known to achieve robustness, but quantile estimation for nonparametric regression has non-negligible bias when the error distribution is asymmetric. This paper explores a global bias-corrected DC strategy based on a quantile-matched composite for nonparametric regressions with general error distributions. The proposed strategy achieves bias correction and robustness simultaneously. Unlike common DC quantile estimation, which uses an identical quantile level to construct the local estimator on every local machine, the new method obtains local estimators at various quantile levels for different data batches and then constructs the global estimator as a weighted sum of the local estimators. In the weighted sum, the weights and quantile levels are matched so that the bias of the global estimator is significantly corrected, especially when the error distribution is asymmetric. Based on the asymptotic properties of the global estimator, the optimal weights are derived, and corresponding algorithms are then proposed. The behavior of the new methods is further illustrated by numerical examples from simulation experiments and real data analyses. Compared with competitors, the new methods have favorable estimation accuracy, robustness, applicability and computational efficiency.
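The contrast between a common quantile level and batch-specific quantile levels can be illustrated with a small simulation. The sketch below is not the paper's algorithm: the optimal weights from the abstract are not reproduced, and every name in it (`local_quantile`, `dc_quantile_composite`, the bandwidth `h`, the test model `1 + 2x`) is a hypothetical choice made for illustration. It assumes mean-zero errors and uses equal weights on an evenly spaced grid of quantile levels, which removes part of the bias only because the average of the error quantiles over such a grid approximates the error mean.

```python
import numpy as np

def local_quantile(x, y, x0, tau, h):
    """Local-constant (kernel-weighted) tau-quantile of y near x0."""
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)      # Gaussian kernel weights
    order = np.argsort(y)
    cdf = np.cumsum(k[order]) / k.sum()          # kernel-weighted empirical CDF
    idx = min(int(np.searchsorted(cdf, tau)), len(y) - 1)
    return y[order][idx]

def dc_quantile_composite(x, y, x0, taus, weights, h):
    """Weighted sum of batch-wise local quantile estimates, one level per batch."""
    taus = np.asarray(taus, dtype=float)
    weights = np.asarray(weights, dtype=float)
    x_batches = np.array_split(x, len(taus))     # one batch per "local machine"
    y_batches = np.array_split(y, len(taus))
    local = np.array([
        local_quantile(xb, yb, x0, t, h)
        for xb, yb, t in zip(x_batches, y_batches, taus)
    ])
    return float(weights @ local)

# Illustrative data (an assumption): m(x) = 1 + 2x with skewed, mean-zero errors.
rng = np.random.default_rng(0)
n, K, h, x0 = 40_000, 20, 0.05, 0.5
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.exponential(1.0, n) - 1.0
w = np.full(K, 1.0 / K)                          # equal weights, NOT the paper's optimal ones

common = dc_quantile_composite(x, y, x0, np.full(K, 0.5), w, h)
matched = dc_quantile_composite(x, y, x0, np.arange(1, K + 1) / (K + 1), w, h)
print(f"true m(x0) = 2.0, common-level DC = {common:.3f}, varied-level DC = {matched:.3f}")
```

In this setup the common-level estimator targets the conditional median rather than the conditional mean, so its error does not shrink with more data, while the varied-level average lands closer to `m(x0)`. The remaining gap comes from the crude equal-weight rule and is the kind of bias that matched weights are meant to correct.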