Paper title
MREC: a fast and versatile framework for aligning and matching point clouds with applications to single cell molecular data
Paper authors
Paper abstract
Comparing and aligning large datasets is a pervasive problem occurring across many different knowledge domains. We introduce and study MREC, a recursive decomposition algorithm for computing matchings between datasets. The basic idea is to partition the data, match the partitions, and then recursively match the points within each pair of identified partitions. The matching itself is done using black-box matching procedures that are too expensive to run on the entire dataset. Using an absolute measure of the quality of a matching, the framework supports optimization over parameters including partitioning procedures and matching algorithms. By design, MREC can be applied to extremely large datasets. We analyze the procedure to describe when we can expect it to work well and demonstrate its flexibility and power by applying it to a number of alignment problems arising in the analysis of single-cell molecular data.
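The partition, match, recurse idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes points are coordinate tuples, that partitions are Voronoi cells around randomly sampled representatives, and that the black-box matcher is a simple nearest-neighbor map standing in for the expensive procedures the abstract refers to. The names `mrec_sketch`, `nearest_match`, `voronoi_cells`, `k`, and `leaf_size` are all hypothetical.

```python
import math
import random


def nearest_match(xs, ys):
    """Stand-in black-box matcher (an assumption): send each point of xs
    to the position of its nearest point in ys."""
    return [min(range(len(ys)), key=lambda j: math.dist(x, ys[j])) for x in xs]


def voronoi_cells(points, idx, reps):
    """Group the indices in idx by their nearest representative index."""
    cells = {r: [] for r in reps}
    for i in idx:
        nearest = min(reps, key=lambda r: math.dist(points[i], points[r]))
        cells[nearest].append(i)
    return {r: c for r, c in cells.items() if c}  # drop empty cells


def mrec_sketch(X, Y, match=nearest_match, k=4, leaf_size=8,
                rng=None, xi=None, yi=None):
    """Return a dict mapping each index of X to an index of Y."""
    if rng is None:
        rng = random.Random(0)
    xi = list(range(len(X))) if xi is None else xi
    yi = list(range(len(Y))) if yi is None else yi

    if min(len(xi), len(yi)) > leaf_size:
        # Partition both sides around sampled representatives.
        xc = voronoi_cells(X, xi, rng.sample(xi, min(k, len(xi))))
        yc = voronoi_cells(Y, yi, rng.sample(yi, min(k, len(yi))))
        if len(xc) > 1 and len(yc) > 1:
            xr, yr = list(xc), list(yc)
            # Match the partitions by matching their representatives.
            rep_match = match([X[r] for r in xr], [Y[r] for r in yr])
            out = {}
            for a, b in enumerate(rep_match):
                # Recurse inside each pair of matched cells.
                out.update(mrec_sketch(X, Y, match, k, leaf_size, rng,
                                       xc[xr[a]], yc[yr[b]]))
            return out

    # Base case: the subproblem is small enough for the black box.
    m = match([X[i] for i in xi], [Y[j] for j in yi])
    return {i: yi[j] for i, j in zip(xi, m)}
```

Passing a different `match` callable swaps the black-box procedure without touching the recursion, which reflects the abstract's point that the partitioning procedure and matching algorithm are parameters of the framework. The sketch does not implement the absolute quality measure mentioned in the abstract.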