Paper title
Theory of functional principal component analysis for discretely observed data
Paper authors
Abstract
Functional data analysis is an important research field in statistics that treats data as random functions drawn from an infinite-dimensional function space, and functional principal component analysis (FPCA) based on eigen-decomposition plays a central role in data reduction and representation. After nearly three decades of research, a key problem remains unsolved, namely, the perturbation analysis of the covariance operator for a diverging number of eigencomponents obtained from noisy and discretely observed data. This is fundamental for studying models and methods based on FPCA, yet there has been no substantial progress since the result of Hall, Müller and Wang (2006) for a fixed number of eigenfunction estimates. In this work, we aim to establish a unified theory for this problem, obtaining upper bounds for eigenfunction estimates with diverging indices in both the $\mathcal{L}^2$ and supremum norms, and deriving the asymptotic distributions of eigenvalues for a wide range of sampling schemes. Our results provide insight into the phenomenon that the $\mathcal{L}^{2}$ bound of eigenfunction estimates with diverging indices can be minimax optimal, as if the curves were fully observed, and reveal the transition of convergence rates from nonparametric to parametric regimes in connection with sparse or dense sampling. We also develop a double truncation technique to handle the uniform convergence of the estimated covariance and eigenfunctions. The technical arguments in this work are useful for handling perturbation series with noisy and discretely observed functional data and can be applied to FPCA-based models, including those involving inverse problems that use FPCA as regularization, such as functional linear regression.
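To make the setting concrete, the following is a minimal numpy sketch of FPCA from noisy, discretely observed curves: pool the observations into an empirical covariance, correct its noise-inflated diagonal, and eigen-decompose the discretized integral operator. This is an illustrative toy, not the paper's estimator; the grid, the simulated model, and the crude diagonal correction (a stand-in for the nonparametric smoothing used in the literature) are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 50                          # number of curves, grid points per curve
t = np.linspace(0.0, 1.0, m)

# True orthonormal eigenfunctions on [0, 1] and eigenvalues (toy model)
phi1 = np.sqrt(2.0) * np.sin(2.0 * np.pi * t)
phi2 = np.sqrt(2.0) * np.cos(2.0 * np.pi * t)
lam = np.array([4.0, 1.0])

# Noisy discrete observations Y_ij = X_i(t_j) + eps_ij
scores = rng.normal(size=(n, 2)) * np.sqrt(lam)
Y = scores @ np.vstack([phi1, phi2]) + 0.5 * rng.normal(size=(n, m))

# Pooled empirical covariance; its diagonal is inflated by the noise
# variance, so replace it using neighboring off-diagonal entries
# (a crude stand-in for local smoothing along the diagonal).
Yc = Y - Y.mean(axis=0)
C = Yc.T @ Yc / n
off = np.diag(C, k=1)
d = np.arange(m)
C[d, d] = np.concatenate(([off[0]], (off[:-1] + off[1:]) / 2.0, [off[-1]]))

# Discretize the integral operator: eigenvalues of C*w approximate lambda_k,
# and L2-normalized eigenfunction estimates are eigenvectors / sqrt(w).
w = 1.0 / m                              # quadrature weight of the uniform grid
evals, evecs = np.linalg.eigh(C * w)
order = np.argsort(evals)[::-1]
lam_hat = evals[order][:2]               # roughly [4, 1], up to sampling error
phi_hat = evecs[:, order[:2]] / np.sqrt(w)

print(lam_hat)
```

The diagonal correction matters because the measurement-error variance contaminates only the diagonal of the raw covariance; without it, every estimated eigenvalue would be biased upward.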