Computing minimal and maximal suffixes of a substring

Maxim Babenko; Gawrychowski P.; Kociumaka T.; Kolesnichenko I.; Starikovskaya T.

doi:10.1016/j.tcs.2015.08.023

Theoretical Computer Science. 2016. Vol. 638. P. 112-121.

We consider the problems of computing the maximal and the minimal non-empty suffixes of substrings of a longer text of length n. For the minimal suffix problem we show that for every τ, 1 ≤ τ ≤ log n, there exists a linear-space data structure with O(τ) query time and O(n log n/τ) preprocessing time. As a sample application, we show that this data structure can be used to compute the Lyndon decomposition of any substring of the text in O(kτ) time, where k is the number of distinct factors in the decomposition. For the maximal suffix problem, we give a linear-space structure with O(1) query time and O(n) preprocessing time. In other words, we simultaneously achieve both the optimal query time and the optimal construction time. © 2015 Elsevier B.V.
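To make the two query types and the Lyndon decomposition concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: the function names (minimal_suffix, maximal_suffix, lyndon_factors) are hypothetical, the suffix queries compare all suffixes directly, and the factorization uses Duval's classical linear-time algorithm rather than the O(kτ) substring method described above. It only illustrates what the queries return and the known fact that the last Lyndon factor of a string is its minimal suffix.

```python
def minimal_suffix(text, i, j):
    """Naive baseline: smallest non-empty suffix of text[i:j] (not the paper's structure)."""
    sub = text[i:j]
    return min(sub[k:] for k in range(len(sub)))


def maximal_suffix(text, i, j):
    """Naive baseline: largest non-empty suffix of text[i:j] (not the paper's structure)."""
    sub = text[i:j]
    return max(sub[k:] for k in range(len(sub)))


def lyndon_factors(s):
    """Lyndon decomposition of s via Duval's algorithm (a swapped-in classical method)."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repeated Lyndon words as far as possible.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    text = "abracadabra"
    print(minimal_suffix(text, 2, 7), maximal_suffix(text, 2, 7))
    print(lyndon_factors(text[2:7]))
```

Each naive query costs time quadratic in the substring length in the worst case, which is the cost the data structures in the abstract are designed to avoid.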

Language: English

Keywords: data structures, lexicographic order, substring queries, maximal suffix, minimal suffix

On Minimal and Maximal Suffixes of a Substring

Babenko M., Kolesnichenko I., Starikovskaya T., in: Lecture Notes in Computer Science. Vol. 7922: Proceedings of the 24th Symposium on Combinatorial Pattern Matching. Berlin: Springer, 2013. P. 28-37.

Lexicographically minimal and lexicographically maximal suffixes of a string are fundamental notions of stringology. It is well known that the lexicographically minimal and maximal suffixes of a given string S can be computed in linear time and space by constructing a suffix tree or a suffix array of S. Here we consider the case when ...

Added: October 30, 2013

Computing Minimal and Maximal Suffixes of a Substring Revisited

Babenko M. A., Kociumaka T., Gawrychowski P. et al., in: Lecture Notes in Computer Science. Vol. 8486: Proceedings of the 25th Annual Symposium on Combinatorial Pattern Matching. Springer, 2014. P. 30-39.

We revisit the problems of computing the maximal and the minimal non-empty suffixes of a substring of a longer text of length n, introduced by Babenko, Kolesnichenko and Starikovskaya [CPM’13]. For the minimal suffix problem we show that for any 1 ≤ τ ≤ log n there exists a linear-space data structure with O(τ) query time and O(n log n/τ) preprocessing time. As a sample application, we show that ...

Added: June 24, 2014

The inverted multi-index

Babenko A., Lempitsky V., in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012). Providence: IEEE, 2012. P. 3069-3076.

A new data structure for efficient similarity search in very large datasets of high-dimensional vectors is introduced. This structure, called the inverted multi-index, generalizes the inverted index idea by replacing the standard quantization within inverted indices with product quantization. For very similar retrieval complexity and preprocessing time, inverted multi-indices achieve a much denser subdivision of ...

Added: October 1, 2014

Wavelet Trees Meet Suffix Trees

Babenko M. A., Gawrychowski P., Kociumaka T. et al., in: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. San Diego: SIAM, 2015. P. 572-591.

We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings. Given a string of length n over an alphabet of size ω ≤ n, our method builds the wavelet tree in O(n log ω / √(log n)) time, improving upon the state-of-the-art algorithm ...

Added: October 4, 2014

Performance Analysis of Thread Synchronization Strategies in Data Structures Based on Flat-Combining

Галимуллин М. Ф., Kalishenko E., Рапоткин Н. А., Известия Санкт-Петербургского государственного электротехнического университета ЛЭТИ 2016 No. 7 P. 13-23

The paper deals with the development of thread synchronization strategies built on concurrent flat-combining data structures and with the study of their performance. It considers the flat-combining approach and its implementation in the libcds library, the development of a thread synchronization strategy, and its possible implementations. The efficiency of the synchronization strategies is studied on ...

Added: November 1, 2018

Cascade Heap: Towards Time-Optimal Extractions

Babenko M. A., Kolesnichenko I., Smirnov I., Theory of Computing Systems 2019 Vol. 63 No. 4 P. 637-646

Heaps are well-studied fundamental data structures, having myriads of applications, both theoretical and practical. We consider the problem of designing a heap with an “optimal” extract-min operation. Assuming an arbitrary linear ordering of keys, a heap with n elements typically takes O(log n) time to extract the minimum. Extracting all elements faster is impossible as ...

Added: December 6, 2019

Lecture Notes in Computer Science

Berlin, Heidelberg : Springer, 2017

The 12th issue of LNCS Transactions on Petri Nets and Other Models of Concurrency (ToPNoC) contains revised and extended versions of a selection of the best papers from the workshops held at the 37th International Conference on Application and Theory of Petri Nets and Concurrency (Petri Nets 2016, Toruń, Poland, 19–24 June 2016), and the ...

Added: September 27, 2017

Algorithms and Data Structures. WADS 2019. Lecture Notes in Computer Science

Springer, 2019

16th International Symposium, WADS 2019, Edmonton, AB, Canada, August 5–7, 2019, Proceedings ...

Added: October 26, 2021

Automata Equipped with Auxiliary Data Structures and Regular Realizability Problems

Rubtsov A. A., Vyalyi M., in: Descriptional Complexity of Formal Systems: 23rd IFIP WG 1.02 International Conference, DCFS 2021, Virtual Event, September 5, 2021, Proceedings. Springer, 2021. P. 150-162.

Added: February 2, 2022

Technologies for Developing Object-Oriented Programs in C++. Part 1. Fundamentals of Structured Programming in the Algorithmic Language C++

Полякова О. А., Perm: Perm National Research Polytechnic University Publishing House, 2019

The article deals with the application of the basic principles of structured programming to complex software systems in the high-level language C++, demonstrated on meaningful examples. ...

Added: August 31, 2020

Proceedings of the 21st International Symposium on String Processing and Information Retrieval

Springer, 2014

This book constitutes the proceedings of the 21st International Symposium on String Processing and Information Retrieval, SPIRE 2014, held in Ouro Preto, Brazil, in October 2014. The 20 full and 6 short papers included in this volume were carefully reviewed and selected from 45 submissions. The papers focus not only on fundamental algorithms in string ...

Added: October 4, 2014

A suffix tree or not a suffix tree?

Vildhoj H. W., Starikovskaya T., in: Combinatorial Algorithms. 25th International Workshop, IWOCA 2014, Duluth, MN, USA, October 15-17, 2014, Revised Selected Papers. Vol. 8986. Springer, 2014. P. 338-350.

In this paper we study the structure of suffix trees. Given an unlabelled tree $\tau$ on $n$ nodes and suffix links of its internal nodes, we ask the question "Is $\tau$ a suffix tree?", i.e., is there a string $S$ whose suffix tree has the same topological structure as $\tau$? We place no restrictions on ...

Added: October 4, 2014

Cross-Document Pattern Matching

Kucherov G., Nekrich Y., Starikovskaya T., in: Lecture Notes in Computer Science. Vol. 7354: Proceedings of the 23rd Symposium on Combinatorial Pattern Matching. Berlin: Springer, 2012. P. 196-207.

We study a new variant of the string matching problem called cross-document string matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, ...

Added: October 30, 2013

Fundamentals of Computation Theory, 22nd International Symposium, FCT 2019, Copenhagen, Denmark, August 12-14, 2019, Proceedings

Springer, 2019

Added: August 4, 2019

Lecture Notes in Computer Science

Berlin : Springer, 2012

This book constitutes the refereed proceedings of the 23rd Annual Symposium on Combinatorial Pattern Matching, CPM 2012, held in Helsinki, Finland, in July 2012. The 33 revised full papers presented together with 2 invited talks were carefully reviewed and selected from 60 submissions. The papers address issues of searching and matching strings and more complicated patterns ...

Added: October 30, 2013

Time-Space Trade-Offs for the Longest Common Substring Problem

Vildhoj H. W., Starikovskaya T., in: Lecture Notes in Computer Science. Vol. 7922: Proceedings of the 24th Symposium on Combinatorial Pattern Matching. Berlin: Springer, 2013. P. 223-234.

Added: October 30, 2013

Computing Discriminating and Generic Words

Kucherov G., Nekrich Y., Starikovskaya T., in: Lecture Notes in Computer Science. Vol. 7608: Proceedings of the 19th International Symposium on String Processing and Information Retrieval. Berlin: Springer, 2012. P. 307-317.

We study the following three problems of computing generic or discriminating words for a given collection of documents. Given a pattern $P$ and a threshold $d$, we want to report (i) all longest extensions of $P$ which occur in at least $d$ documents, (ii) all shortest extensions of $P$ which occur in less than $d$ ...

Added: October 30, 2013

Hybrid neural network and bi-criteria tabu-machine: comparison of new approaches to maximum clique problem

Babkina T. S., Demidovskij A., Babkin E., International Journal of Big Data Intelligence 2018 Vol. 5 No. 3 P. 143-155

This paper presents two new approaches to solving the classical NP-hard maximum clique problem (MCP), which frequently arises in the domain of information management, including design of database structures and big data processing. In our research, we are focusing on solving that problem using the paradigm of artificial neural networks. The first approach ...

Added: October 3, 2018