Paper title
Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE
Authors
Abstract
t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in speeding up t-SNE and obtaining finer-grained structure, we combine the two to create tree-SNE, a hierarchical clustering and visualization algorithm based on stacked one-dimensional t-SNE embeddings. We also introduce alpha-clustering, which recommends the optimal cluster assignment, without foreknowledge of the number of clusters, based on cluster stability across multiple scales. We demonstrate the effectiveness of tree-SNE and alpha-clustering on images of handwritten digits, mass cytometry (CyTOF) data from blood cells, and single-cell RNA-sequencing (scRNA-seq) data from retinal cells. Furthermore, to demonstrate the validity of the visualization, we use alpha-clustering to obtain unsupervised clustering results competitive with the state of the art on several image data sets. Software is available at https://github.com/isaacrob/treesne.
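The idea of recommending a cluster assignment from "cluster stability across multiple scales" can be illustrated with a small sketch. It assumes, purely for illustration, that each level of the stacked embedding has already been clustered and that stability means the number of clusters staying unchanged over the longest run of consecutive levels. The function name `most_stable_cluster_count` and the sample counts are hypothetical. This is not the treesne package's API, and the abstract does not specify the actual selection rule.

```python
from itertools import groupby


def most_stable_cluster_count(counts_per_level):
    """Return (cluster_count, run_length) for the longest run of
    consecutive levels that share the same number of clusters.

    Ties are resolved in favour of the earliest run. An empty input
    yields (None, 0).
    """
    best_count, best_len = None, 0
    # groupby yields one group per run of equal consecutive values
    for count, run in groupby(counts_per_level):
        run_len = sum(1 for _ in run)
        if run_len > best_len:
            best_count, best_len = count, run_len
    return best_count, best_len


# Hypothetical cluster counts for successive levels of the hierarchy
levels = [1, 2, 2, 5, 10, 10, 10, 10, 13, 17]
print(most_stable_cluster_count(levels))  # (10, 4)
```

Here the count of 10 clusters persists over four consecutive levels, so it is the one this simplified rule would recommend. No target number of clusters is supplied in advance.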