Paper Title

Transportation-Based Functional ANOVA and PCA for Covariance Operators

Authors

Valentina Masarotto, Victor M. Panaretos, Yoav Zemel

Abstract

We consider the problem of comparing several samples of stochastic processes with respect to their second-order structure, and describing the main modes of variation in this second-order structure, if present. These tasks can be seen as an Analysis of Variance (ANOVA) and a Principal Component Analysis (PCA) of covariance operators, respectively. They arise naturally in functional data analysis, where several populations are to be contrasted relative to the nature of their dispersion around their means, rather than relative to their means themselves. We contribute a novel approach based on optimal (multi)transport, where each covariance can be identified with a centred Gaussian process of corresponding covariance. By means of constructing the optimal simultaneous coupling of these Gaussian processes, we contrast the (linear) maps that achieve it with the identity with respect to a norm-induced distance. The resulting test statistic, calibrated by permutation, is seen to distinctly outperform the state-of-the-art, and to furnish considerable power even under local alternatives. This effect is seen to be genuinely functional, and is related to the potential for perfect discrimination in infinite dimensions. In the event of a rejection of the null hypothesis stipulating equality, a geometric interpretation of the transport maps allows us to construct a (tangent space) PCA revealing the main modes of variation. As a necessary step to developing our methodology, we prove results on the existence and boundedness of optimal multitransport maps. These are of independent interest in the theory of transport of Gaussian processes. The transportation ANOVA and PCA are illustrated on a variety of simulated and real examples.
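
As a rough illustration of the pipeline the abstract describes, the following finite-dimensional sketch in Python may help. It is not the authors' implementation: it assumes covariance matrices in place of covariance operators, computes the simultaneous coupling through a standard fixed-point iteration for the Wasserstein barycenter of centred Gaussians, uses the Frobenius norm as the norm-induced distance to the identity, and requires each sample covariance to be invertible (more observations than dimensions). All function names are hypothetical.

```python
import numpy as np


def sqrtm_psd(a):
    """Symmetric square root of a positive semi-definite matrix."""
    w, v = np.linalg.eigh((a + a.T) / 2)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.T


def transport_map(a, b):
    """Optimal linear map pushing N(0, a) to N(0, b); assumes a is invertible."""
    ra = sqrtm_psd(a)
    ra_inv = np.linalg.inv(ra)
    return ra_inv @ sqrtm_psd(ra @ b @ ra) @ ra_inv


def barycenter(covs, iters=100, tol=1e-10):
    """Fixed-point iteration for the Wasserstein barycenter of centred Gaussians."""
    s = np.mean(covs, axis=0)
    for _ in range(iters):
        t_bar = np.mean([transport_map(s, c) for c in covs], axis=0)
        s_new = t_bar @ s @ t_bar
        converged = np.linalg.norm(s_new - s) < tol
        s = s_new
        if converged:
            break
    return s


def statistic(covs):
    """Sum of squared Frobenius distances between transport maps and the identity."""
    b = barycenter(covs)
    eye = np.eye(b.shape[0])
    return sum(np.linalg.norm(transport_map(b, c) - eye, "fro") ** 2 for c in covs)


def permutation_test(samples, n_perm=200, seed=0):
    """Calibrate the statistic by permuting group labels of the centred, pooled data."""
    rng = np.random.default_rng(seed)
    sizes = [len(x) for x in samples]
    pooled = np.vstack([x - x.mean(axis=0) for x in samples])
    cuts = np.cumsum(sizes)[:-1]

    def stat_of(data):
        groups = np.split(data, cuts)
        return statistic([np.cov(g, rowvar=False) for g in groups])

    observed = stat_of(pooled)
    perms = [stat_of(rng.permutation(pooled)) for _ in range(n_perm)]
    p_value = (1 + sum(s >= observed for s in perms)) / (1 + n_perm)
    return observed, p_value
```

Each element of `samples` is an array with one observation per row (for example, curves evaluated on a common grid). In this sketch, the maps `transport_map(b, c) - eye` would also be the natural objects on which to run a tangent-space PCA once equality is rejected.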