邻居加入偏见的组合和计算调查

论文标题

邻居加入偏见的组合和计算调查

Combinatorial and computational investigations of Neighbor-Joining bias

论文作者

Davidson, Ruth, del Campo, Abraham Martin

论文摘要

邻居加入算法是一种流行的基于距离的系统发育方法，该方法从生物学数据引起的差异图中计算了树度量。将差异图视为欧几里得空间中的点，算法将输入空间划分为由返回的树的组合类型所索引的多面体区域。尚未找到对这些地区的完整组合描述。邻居加入聚集事件的不同序列可以产生相同的组合树，因此将多个几何区域与相同的算法输出相关联。我们通过在树上定义集聚顺序来解决这种混乱，从而导致输出空间的不同区域与加权Motzkin路径之间进行两次培养。结果，我们仅根据分类单元的数量为多面体区域的数量提供一个公式。我们以这些多面体区域之间的计算比较结束，以揭示算法的任何实施中引入的偏差。

The Neighbor-Joining algorithm is a popular distance-based phylogenetic method that computes a tree metric from a dissimilarity map arising from biological data. Realizing dissimilarity maps as points in Euclidean space, the algorithm partitions the input space into polyhedral regions indexed by the combinatorial type of the trees returned. A full combinatorial description of these regions has not been found yet; different sequences of Neighbor-Joining agglomeration events can produce the same combinatorial tree, therefore associating multiple geometric regions to the same algorithmic output. We resolve this confusion by defining agglomeration orders on trees, leading to a bijection between distinct regions of the output space and weighted Motzkin paths. As a result, we give a formula for the number of polyhedral regions depending only on the number of taxa. We conclude with a computational comparison between these polyhedral regions, to unveil biases introduced in any implementation of the algorithm.

下载PDF全文

下载文献需遵守相关版权规定

论文标题