Grassmannian diffusion maps based dimension reduction and classification for high-dimensional data

Paper title

Grassmannian diffusion maps based dimension reduction and classification for high-dimensional data

Authors

Santos, K. R. M. dos, Giovanis, D. G., Shields, M. D.

Abstract

This work introduces Grassmannian Diffusion Maps (GDMaps), a novel nonlinear dimensionality reduction technique that defines the affinity between points through their representation as low-dimensional subspaces, which correspond to points on the Grassmann manifold. The method is designed for applications such as image recognition and data-based classification of high-dimensional data that can be compactly represented in a lower-dimensional subspace. GDMaps consists of two stages. The first is a pointwise linear dimensionality reduction in which each high-dimensional object is mapped onto the Grassmann manifold. The second is a multi-point nonlinear kernel-based dimension reduction that uses diffusion maps to identify the subspace structure of the points on the Grassmann manifold. To this end, an appropriate Grassmannian kernel is used to construct the transition matrix of a random walk on a graph connecting points on the Grassmann manifold. Spectral analysis of the transition matrix yields low-dimensional Grassmannian diffusion coordinates that embed the data into a low-dimensional reproducing kernel Hilbert space. Further, a novel data classification/recognition technique is developed based on the construction of an overcomplete dictionary of reduced dimension whose atoms are given by the Grassmannian diffusion coordinates. Three examples are considered. First, a "toy" example shows that GDMaps can identify an appropriate parametrization of structured points on the unit sphere. The second example demonstrates the ability of GDMaps to reveal the intrinsic subspace structure of high-dimensional random field data. In the last example, a face recognition problem is solved for face images subject to varying illumination conditions, changes in facial expression, and occlusions.
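
The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: each data point is a matrix represented only by its top-p left singular subspace, the "appropriate Grassmannian kernel" is taken to be the projection kernel k(U_i, U_j) = ||U_i^T U_j||_F^2, and no density normalization is applied before building the random walk. The function names (grassmann_points, projection_kernel, diffusion_coordinates) and the toy data are hypothetical.

```python
import numpy as np


def grassmann_points(samples, p):
    """Stage 1 (assumed form): represent each matrix by an orthonormal
    basis of its top-p left singular subspace, a point on the Grassmann manifold."""
    bases = []
    for X in samples:
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        bases.append(U[:, :p])
    return bases


def projection_kernel(bases):
    """Projection kernel k(U_i, U_j) = ||U_i^T U_j||_F^2 (assumed kernel choice)."""
    n = len(bases)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = np.linalg.norm(bases[i].T @ bases[j], "fro") ** 2
    return K


def diffusion_coordinates(K, n_coords=1, t=1):
    """Stage 2: spectral analysis of the random-walk transition matrix P = D^-1 K.
    The symmetric matrix S = D^-1/2 K D^-1/2 shares its eigenvalues with P."""
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)             # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]   # sort descending
    psi = evecs / np.sqrt(d)[:, None]            # right eigenvectors of P
    # Skip the trivial pair (eigenvalue 1, constant eigenvector).
    coords = psi[:, 1:n_coords + 1] * evals[1:n_coords + 1] ** t
    return coords, evals


# Hypothetical toy data: 5 matrices near span(e1, e2) and 5 near span(e1, e3) in R^6.
rng = np.random.default_rng(0)
I6 = np.eye(6)
basis_a, basis_b = I6[:, [0, 1]], I6[:, [0, 2]]


def sample(basis):
    # Coefficients [I | R] keep both signal singular values >= 1; the noise is small.
    C = np.hstack([np.eye(2), rng.normal(size=(2, 2))])
    return basis @ C + 0.01 * rng.normal(size=(6, 4))


samples = [sample(basis_a) for _ in range(5)] + [sample(basis_b) for _ in range(5)]
K = projection_kernel(grassmann_points(samples, p=2))
coords, evals = diffusion_coordinates(K, n_coords=1)
print(np.round(coords[:, 0], 4))
```

In this toy setting the first nontrivial diffusion coordinate takes one sign on the first five samples and the opposite sign on the last five, so the two underlying subspaces are separated in a one-dimensional embedding. Which group receives the positive sign depends on the eigenvector sign returned by the solver.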

Paper title