Paper Title

Least Squares Estimation Using Sketched Data with Heteroskedastic Errors

Authors

Sokbae Lee, Serena Ng

Abstract

Researchers may perform regressions using a sketch of data of size $m$ instead of the full sample of size $n$ for a variety of reasons. This paper considers the case when the regression errors do not have constant variance and heteroskedasticity-robust standard errors would normally be needed for test statistics to provide accurate inference. We show that estimates using data sketched by random projections will behave "as if" the errors were homoskedastic. Estimation by random sampling would not have this property. The result arises because the sketched estimates in the case of random projections can be expressed as degenerate $U$-statistics, and under certain conditions, these statistics are asymptotically normal with homoskedastic variance. We verify that the conditions hold not only in the case of least squares regression when the covariates are exogenous, but also in instrumental variables estimation when the covariates are endogenous. The result implies that inference, including first-stage F tests for instrument relevance, can be simpler than the full sample case if the sketching scheme is appropriately chosen.
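The contrast between the two sketching schemes can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian projection matrix, the data-generating process, the sample sizes, and all function names (`ols_with_se`, `sketch_random_projection`, `sketch_random_sampling`) are hypothetical choices made for illustration. It fits least squares on each sketch and reports both the homoskedastic and the heteroskedasticity-robust (HC0) standard errors so the two can be compared.

```python
import numpy as np


def ols_with_se(X, y):
    """Return OLS coefficients with homoskedastic and HC0 robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Homoskedastic formula: s^2 * (X'X)^{-1}
    se_homo = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # HC0 sandwich: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
    meat = (X * resid[:, None] ** 2).T @ X
    se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_homo, se_hc0


def sketch_random_projection(X, y, m, rng):
    """Assumed scheme: Gaussian projection, each sketched row mixes all n rows."""
    n = X.shape[0]
    Pi = rng.standard_normal((m, n)) / np.sqrt(m)  # scaled so E[Pi' Pi] = I_n
    return Pi @ X, Pi @ y


def sketch_random_sampling(X, y, m, rng):
    """Assumed scheme: uniform sampling of m rows without replacement."""
    idx = rng.choice(X.shape[0], size=m, replace=False)
    return X[idx], y[idx]


# Hypothetical data: error standard deviation grows with |x| (heteroskedastic).
rng = np.random.default_rng(0)
n, m = 2000, 500
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + np.abs(x) * rng.standard_normal(n)

for name, sketch in [("random projection", sketch_random_projection),
                     ("random sampling", sketch_random_sampling)]:
    Xs, ys = sketch(X, y, m, rng)
    beta, se_homo, se_hc0 = ols_with_se(Xs, ys)
    print(name, "beta:", beta, "homoskedastic SE:", se_homo, "HC0 SE:", se_hc0)
```

If the abstract's claim carries over to this setup, the two standard errors should be close to each other on the projected sketch and differ more noticeably on the sampled sketch. A single run is only suggestive; repeating it over many seeds would give a fairer picture.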
