Paper Title

Stability Approach to Regularization Selection for Reduced-Rank Regression

Authors

Wen, Canhong; Wang, Qin; Jiang, Yuan

Abstract

The reduced-rank regression model is a popular model for handling multivariate responses and multiple predictors, and is widely used in biology, chemometrics, econometrics, engineering, and other fields. In reduced-rank regression modelling, a central objective is to estimate the rank of the coefficient matrix, which represents the number of effective latent factors in predicting the multivariate response. Although theoretical results such as rank estimation consistency have been established for various methods, in practice rank determination still relies on information-criterion-based methods such as AIC and BIC or subsampling-based methods such as cross-validation. Unfortunately, the theoretical properties of these practical methods are largely unknown. In this paper, we present a novel method called StARS-RRR that selects the tuning parameter and then estimates the rank of the coefficient matrix for reduced-rank regression based on the stability approach. We prove that StARS-RRR achieves rank estimation consistency, i.e., the rank estimated with the tuning parameter selected by StARS-RRR is consistent for the true rank. Through a simulation study, we show that StARS-RRR outperforms other tuning parameter selection methods, including AIC, BIC, and cross-validation, as it provides the most accurate estimated rank. In addition, when applied to a breast cancer dataset, StARS-RRR discovers a reasonable number of genetic pathways that affect DNA copy number variations and yields a smaller prediction error than the other methods under a random-splitting process.
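To make the idea of a stability-based view of rank estimation concrete, the following is a minimal Python sketch. It is not the StARS-RRR procedure itself, since the abstract does not specify the paper's instability measure or its selection rule. The sketch assumes the classical rank-constrained least-squares estimator, a simple singular-value threshold `lam` as the tuning parameter, and the variance of the estimated rank across random subsamples as an illustrative instability score. All function names (`simulate`, `rrr_fit`, `estimate_rank`, `rank_instability`) are hypothetical.

```python
import numpy as np


def simulate(n=200, p=5, q=4, rank=2, sigma=0.1, seed=0):
    """Hypothetical helper: draw (X, Y) from Y = X B + noise with rank(B) = rank."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((p, rank)))  # orthonormal p x rank
    V, _ = np.linalg.qr(rng.standard_normal((q, rank)))  # orthonormal q x rank
    B = U @ np.diag(np.linspace(3.0, 2.0, rank)) @ V.T   # known singular values
    X = rng.standard_normal((n, p))
    Y = X @ B + sigma * rng.standard_normal((n, q))
    return X, Y, B


def rrr_fit(X, Y, rank):
    """Classical rank-constrained least squares (identity weighting).

    The OLS coefficients are projected onto the top `rank` right singular
    vectors of the OLS fitted values X @ B_ols.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V = Vt[:rank].T
    return B_ols @ V @ V.T


def estimate_rank(X, Y, lam):
    """Assumed rule: count singular values of the OLS fitted values above lam."""
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    s = np.linalg.svd(X @ B_ols, compute_uv=False)
    return int(np.sum(s > lam))


def rank_instability(X, Y, lam, n_subsamples=20, frac=0.5, seed=0):
    """Estimate the rank on random subsamples drawn without replacement.

    Returns the estimated ranks and their variance, used here as an
    illustrative (assumed) instability score for the threshold lam.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(frac * n)
    ranks = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)
        ranks.append(estimate_rank(X[idx], Y[idx], lam))
    ranks = np.array(ranks)
    return ranks, float(np.var(ranks))


if __name__ == "__main__":
    X, Y, B_true = simulate()
    for lam in (0.0, 5.0, 1e9):
        ranks, instab = rank_instability(X, Y, lam)
        # Counts of each estimated rank across subsamples, plus the score.
        print(lam, np.bincount(ranks), instab)
    B_hat = rrr_fit(X, Y, rank=2)
    print("estimation error:", np.linalg.norm(B_hat - B_true))
```

In this sketch a threshold that is too small keeps noise directions and one that is too large discards signal, while the estimated rank is stable across subsamples on either extreme. A stability-based rule therefore needs more than "zero variance" to pick a threshold; how StARS-RRR resolves this is described in the paper, not here.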
