论文标题
在线算法,用于查找具有长度和多个前缀和后缀条件的独特子字符串
Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions
论文作者
论文摘要
让两个静态序列的字符串$ p $和$ s $分别代表前缀和后缀条件,作为预处理的输入。在查询中,给出两个正整数$ k_1 $和$ k_2 $,以及以在线方式给出的字符串$ t $,这样$ t_i $代表$ t $ $ t $的长度-1 \ leq i \ leq i \ leq i \ leq i \ leq | t | $。在本文中,我们有兴趣计算$ \ mathit {ans_i} $ of Difints substrings $ w $的$ t_i $,以便$ k_1 \ leq | w | \ leq k_2 $,$ w $包含p $的一些$ p \作为前缀,而在s $中包含一些$ s \作为后缀。更具体地说,计数问题是输出$ | \ Mathit {ans_i} | $,而报告问题是输出每个迭代$ i $的所有元素。令$σ$表示字母大小,对于一系列字符串$ a $,$ \ vert a \ vert = \ sum_ {u \ in} | u | $。然后,我们表明$ o((((\ vert p \ vert + \ vert s \ vert)\logσ)$ - 时间预处理,每次迭代的计数和报告问题的解决方案最高可输出$ o(| o o(| t_i | \logσ)$(| t_i | \logσ)$ o(t_i | t_i | t_i | t_i | t_i | fligin | \ nime | | | | | | | | | \ mathit。对于$ \ vert p \ vert p \ vert +vert +vert s \ vert $,用于整数字母的预处理时间可以简化为$ o(\ vert p \ vert +\ vert s \ vert)$。我们的算法可能应用于网络流量分类。
Let two static sequences of strings $P$ and $S$, representing prefix and suffix conditions respectively, be given as input for preprocessing. For the query, let two positive integers $k_1$ and $k_2$ be given, as well as a string $T$ given in an online manner, such that $T_i$ represents the length-$i$ prefix of $T$ for $1 \leq i \leq |T|$. In this paper we are interested in computing the set $\mathit{ans_i}$ of distinct substrings $w$ of $T_i$ such that $k_1 \leq |w| \leq k_2$, and $w$ contains some $p \in P$ as a prefix and some $s \in S$ as a suffix. More specifically, the counting problem is to output $|\mathit{ans_i}|$, whereas the reporting problem is to output all elements of $\mathit{ans_i}$, for each iteration $i$. Let $σ$ denote the alphabet size, and for a sequence of strings $A$, $\Vert A\Vert=\sum_{u\in A}|u|$. Then, we show that after $O((\Vert P\Vert +\Vert S\Vert)\logσ)$-time preprocessing, the solutions for the counting and reporting problems for each iteration up to $i$ can be output in $O(|T_i| \logσ)$ and $O(|T_i| \logσ+ |\mathit{ans_i}|)$ total time. The preprocessing time can be reduced to $O(\Vert P\Vert +\Vert S\Vert)$ for integer alphabets of size polynomial with regard to $\Vert P\Vert +\Vert S\Vert$. Our algorithms have possible applications to network traffic classification.