Paper title
Secuer: ultrafast, scalable and accurate clustering of single-cell RNA-seq data
Paper authors
Paper abstract
Identifying cell clusters is a critical step in single-cell transcriptomics studies. Although numerous clustering tools have been developed recently, the rapid growth in the volume of scRNA-seq data calls for more computationally efficient clustering methods. Here, we introduce Secuer, a Scalable and Efficient speCtral clUstERing algorithm for scRNA-seq data. By employing an anchor-based bipartite graph representation algorithm, Secuer reduces runtime and memory usage by orders of magnitude, especially for ultra-large datasets profiling over 1 million or even 10 million cells. Meanwhile, Secuer achieves accuracy better than or comparable to that of competing methods on small and moderate-sized benchmark datasets. Furthermore, we show that Secuer can serve as a building block for a new consensus clustering method, Secuer-consensus, which in turn greatly improves on the runtime and scalability of state-of-the-art consensus clustering methods while maintaining accuracy. Overall, Secuer is a versatile, accurate, and scalable clustering framework suitable for small to ultra-large single-cell clustering tasks.
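To make the phrase "anchor-based bipartite graph representation" more concrete, the following is a minimal sketch of the general technique, not Secuer's actual implementation. It assumes randomly sampled anchors, a Gaussian kernel over each cell's nearest anchors, and a plain SVD of the normalized cell-by-anchor matrix; the function names `anchor_spectral_clustering` and `kmeans` and all parameter names are hypothetical. The point it illustrates is that the graph is n x p (cells by anchors) rather than n x n, which is where the runtime and memory savings of this family of methods come from.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with farthest-point initialisation (illustrative only)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its closest already-chosen center.
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def anchor_spectral_clustering(X, n_clusters, n_anchors=20, n_neighbors=3, seed=0):
    """Spectral clustering on a cell-by-anchor bipartite graph (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # 1. Anchors: a random subset of cells (an assumption made for brevity).
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]
    # 2. Connect each cell to its nearest anchors with Gaussian weights.
    #    B is dense here for readability; a large-scale version would keep it sparse.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nn = np.argpartition(d2, n_neighbors - 1, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    nn_d2 = d2[rows, nn]
    sigma2 = max(nn_d2.mean(), 1e-12)
    B = np.zeros((n, n_anchors))
    B[rows, nn] = np.exp(-nn_d2 / (2.0 * sigma2))
    # 3. Degree-normalise the bipartite graph: Z = Dx^(-1/2) B Da^(-1/2).
    dx = np.maximum(B.sum(axis=1), 1e-12)
    da = np.maximum(B.sum(axis=0), 1e-12)
    Z = B / np.sqrt(dx)[:, None] / np.sqrt(da)[None, :]
    # 4. The leading left singular vectors of Z give the spectral embedding of cells.
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    emb = U[:, :n_clusters]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    # 5. Cluster the embedded cells.
    return kmeans(emb, n_clusters)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two synthetic, well-separated groups of "cells" in a 5-dimensional feature space.
    X = np.vstack([rng.normal(0, 0.5, (100, 5)), rng.normal(10, 0.5, (100, 5))])
    labels = anchor_spectral_clustering(X, n_clusters=2, n_anchors=40, n_neighbors=5)
    print(np.bincount(labels))
```

In this sketch the only decomposition is of an n x p matrix with p much smaller than n, so the cost grows roughly linearly with the number of cells for fixed p. How anchors are chosen, how neighbors are found, and how the number of clusters is determined in Secuer itself are not specified in the abstract and are not modeled here.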