• A
  • A
  • A
  • АБB
  • АБB
  • АБB
  • А
  • А
  • А
  • А
  • А
Обычная версия сайта

Глава

Computing Longest Common Substrings Via Suffix Arrays

P. 64-75.
Babenko M. A., Starikovskaya T.

Given a set of $N$ strings $A = \set{\alpha_1, \ldots, \alpha_N}$ of total length $n$ over alphabet~$\Sigma$ one may ask to find, for a fixed integer $K$, $2 \le K \le N$, the longest substring $\beta$ that appears in at least $K$ strings in $A$. It is known that this problem can be solved in $O(n)$ time with the help of suffix trees. However, the resulting algorithm is rather complicated. Also, its running time and memory consumption may depend on~$\abs{\Sigma}$.   This paper presents an alternative, remarkably simple approach to the above problem, which relies on the notion of suffix arrays. Once the suffix array of some auxiliary $O(n)$-length string is computed, one needs a simple $O(n)$-time postprocessing to find the requested longest substring. Since a number of efficient and simple linear-time algorithms for constructing suffix arrays has been recently developed (with constant not depending on $|\Sigma|$), our approach seems to be quite practical.