Paper Title

Consistency of archetypal analysis

Authors

Braxton Osting, Dong Wang, Yiming Xu, Dominique Zosso

Abstract

Archetypal analysis is an unsupervised learning method that uses a convex polytope to summarize multivariate data. For fixed $k$, the method finds a convex polytope with $k$ vertices, called archetype points, such that the polytope is contained in the convex hull of the data and the mean squared distance between the data and the polytope is minimal. In this paper, we prove a consistency result: if the data are independently sampled from a probability measure with bounded support, then the archetype points converge to a solution of the continuum version of the problem, for which we identify and establish several properties. We also obtain the convergence rate of the optimal objective values under appropriate assumptions on the distribution. If the data are independently sampled from a distribution with unbounded support, we also prove a consistency result for a modified method that penalizes the dispersion of the archetype points. Our analysis is supported by detailed computational experiments on the archetype points for data sampled from the uniform distribution in a disk, the normal distribution, an annular distribution, and a Gaussian mixture model.
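To make the optimization problem in the abstract concrete, the following is a minimal sketch of archetypal analysis in Python with NumPy. It is not the authors' implementation. It assumes the common parametrization in which the archetypes are convex combinations of the data (A = B X, so they lie in the convex hull of the data) and each data point is approximated by a convex combination of the archetypes (X ≈ W A). The solver is a plain alternating projected-gradient scheme chosen for simplicity. The names `project_simplex`, `archetypal_analysis`, `W`, `B`, and `n_iter` are hypothetical and do not come from the paper.

```python
import numpy as np


def project_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    m, p = V.shape
    U = np.sort(V, axis=1)[:, ::-1]           # rows sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, p + 1)
    rho = (U - css / idx > 0).sum(axis=1)     # number of active coordinates (>= 1)
    theta = css[np.arange(m), rho - 1] / rho  # per-row shift
    return np.maximum(V - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=200, seed=0):
    """Sketch: minimize (1/n) * ||X - W B X||_F^2 over row-stochastic W and B.

    X is (n, d). The archetypes are the rows of A = B @ X. Because W is only
    an approximate minimizer, the recorded objective is an upper bound on the
    mean squared distance from the data to the polytope conv(A).
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Assumption: start the archetypes at k distinct, randomly chosen data points.
    B = np.zeros((k, n))
    B[np.arange(k), rng.choice(n, size=k, replace=False)] = 1.0
    W = np.full((n, k), 1.0 / k)
    x_norm_sq = np.linalg.norm(X, 2) ** 2     # squared spectral norm of X
    history = []
    for _ in range(n_iter):
        A = B @ X
        # W-step: projected gradient with step 1/L_w (L_w is a Lipschitz constant).
        L_w = 2.0 / n * np.linalg.norm(A, 2) ** 2 + 1e-12
        W = project_simplex(W - (2.0 / n) * ((W @ A - X) @ A.T) / L_w)
        # B-step: projected gradient with step 1/L_b.
        L_b = 2.0 / n * np.linalg.norm(W, 2) ** 2 * x_norm_sq + 1e-12
        grad_B = (2.0 / n) * (W.T @ (W @ A - X) @ X.T)
        B = project_simplex(B - grad_B / L_b)
        history.append(np.mean(np.sum((X - W @ B @ X) ** 2, axis=1)))
    return B @ X, W, B, history


if __name__ == "__main__":
    # Data sampled uniformly from the unit disk, one of the settings named in the abstract.
    rng = np.random.default_rng(1)
    r, t = np.sqrt(rng.random(300)), 2 * np.pi * rng.random(300)
    X = np.column_stack([r * np.cos(t), r * np.sin(t)])
    A, W, B, history = archetypal_analysis(X, k=3)
    print("archetypes:\n", A)
    print("final objective:", history[-1])
```

Both updates use a step size of one over a Lipschitz constant of the corresponding gradient, so the objective does not increase from one iteration to the next. Convergence of this simple scheme can be slow, and a dedicated solver would be preferable for serious experiments.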
