论文标题
近似大数据集的P均值曲线
Approximating The p-Mean Curve of Large Data-Sets
论文作者
论文摘要
一组分段线性功能,称为polylines,$ p_1,\ ldots,p_l $,每个最多都可以将$ n $ dertices简化为带有$ k $ dertices的polyline $ m $,这样$ε_1,\ ldots, + ldots,ε__l$,每个polyline $ ldots,ε_l$ to youse $ ld_ $ ld_ $ l_ l_p $ l_p $ l_p $ l_p。我们将$ m $称为$ l_p $,带有$ p \ geq 1 $ a $ p $ -MEAN曲线($ P $ -MC)。 我们讨论了$ p \ geq 1 $,其中$ l_p $ dange满足了三角形不平等,而$ p $ -Mean以前尚未针对大多数值$ p $讨论。计算$ p $ -MEAN PELLINE系列是$ L =ω(1)$和$ P $的一些值,因此我们讨论近似算法。 我们给出$ O(n^2 \ log K)$ time Exact Exact算法,$ l = 2 $和$ p \ geq 1 $。此外,我们将Fréchet距离降低到离散的Fréchet距离,这为$ k $和$ε$增加了$ 2 $的因子$ 2 $。然后,我们使用精确的算法来找到$ 3 $ - approximation $ l> 2 $ in $ \ operatatorName {poly}(n,l)$ time。我们的方法基于Fréchet距离的自由空间图(FSD)的概括,以及用于近似摘要的可组合核心集。
A set of piecewise linear functions, called polylines, $P_1,\ldots,P_L$ each with at most $n$ vertices can be simplified into a polyline $M$ with $k$ vertices, such that the Fréchet distances $ε_1,\ldots,ε_L$ to each of these polylines are minimized under the $L_p$ distance. We call $M$ for $L_p$ with $p\geq 1$ a $p$-mean curve ($p$-MC). We discuss $p\geq 1$, for which $L_p$ distance satisfies the triangle inequality and $p$-mean has not been discussed before for most values $p$. Computing the $p$-mean polyline is NP-hard for $L=Ω(1)$ and some values of $p$, so we discuss approximation algorithms. We give a $O(n^2\log k)$ time exact algorithm for $L=2$ and $p\geq 1$. Also, we reduce the Fréchet distance to the discrete Fréchet distance which adds a factor $2$ to both $k$ and $ε$. Then we use our exact algorithm to find a $3$-approximation for $L>2$ in $\operatorname{poly}(n,L)$ time. Our method is based on a generalization of the free-space diagram (FSD) for Fréchet distance and composable core-sets for approximate summaries.