Paper Title
Spectral Learning on Matrices and Tensors
Paper Authors
Abstract
Spectral methods have been a mainstay in several domains, such as machine learning and scientific computing. They involve finding a certain kind of spectral decomposition to obtain basis functions that can capture important structures in the problem at hand. The most common spectral method is principal component analysis (PCA). It utilizes the top eigenvectors of the data covariance matrix, e.g., to carry out dimensionality reduction. This data pre-processing step is often effective in separating signal from noise. PCA and other spectral techniques applied to matrices have several limitations. By restricting to pairwise moments, they effectively make a Gaussian approximation of the underlying data and fail on data with hidden variables that lead to non-Gaussianity. However, in most data sets, there are latent effects that cannot be directly observed, e.g., topics in a document corpus, or underlying causes of a disease. By extending spectral decomposition methods to higher-order moments, we demonstrate the ability to learn a wide range of latent variable models efficiently. Higher-order moments can be represented by tensors, and intuitively, they can encode more information than pairwise moment matrices. More crucially, tensor decomposition can pick up latent effects that are missed by matrix methods, e.g., uniquely identifying non-orthogonal components. Exploiting these aspects turns out to be fruitful for provable unsupervised learning of a wide range of latent variable models. We also outline the computational techniques used to design efficient tensor decomposition methods. We introduce TensorLy, which has a simple Python interface for expressing tensor operations. It has a flexible back-end system supporting NumPy, PyTorch, TensorFlow, and MXNet, amongst others, allowing multi-GPU and CPU operations and seamless integration with deep-learning functionalities.
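The contrast the abstract draws between pairwise moments (PCA on the covariance matrix) and higher-order moments (tensors) can be illustrated with a minimal NumPy sketch. The data, dimensions, and component count below are purely illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples in 5 dimensions with correlated coordinates.
X = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 5))

# Pairwise moments: center the data and form the covariance matrix.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(Xc) - 1)

# PCA: the top-k eigenvectors of the covariance give the projection basis.
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
k = 2
top = eigvecs[:, -k:]        # top-k eigenvectors
X_reduced = Xc @ top         # dimensionality reduction to k components

# Higher-order analogue: the empirical third-order moment tensor
# E[x (x) x (x) x], represented as a 5 x 5 x 5 array.
T3 = np.einsum('ni,nj,nk->ijk', Xc, Xc, Xc) / len(Xc)
```

PCA only sees the `cov` matrix, whereas methods such as the tensor decompositions discussed in the paper operate on objects like `T3`, which retains information beyond second-order statistics.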