Minimal Discriminating Words Problem Revisited

Kucherov G., Nekrich Y., Gawrychowski P., Starikovskaya T.

P. 129–140.

We revisit two variants of the problem of computing minimal discriminating words studied in [5]. Given a pattern P and a threshold d, we want to report (i) all shortest extensions of P which occur in fewer than d documents, and (ii) all shortest extensions of P which occur only in d selected documents. For the first problem, we give an optimal solution with constant time per output word. For the second problem, we propose an algorithm with running time O(|P| + d · (1 + output)), improving the solution of [5].
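To make variant (i) concrete, here is a brute-force sketch in Python. It is not the algorithm from the paper, which relies on index structures to achieve constant time per output word. The sketch assumes that an "extension" of P is a word of the form P·X found in the collection, and that "shortest" means prefix-minimal: the word occurs in fewer than d documents, while every shorter prefix that still starts with P occurs in at least d documents. The function names `doc_count` and `minimal_discriminating_words` are hypothetical.

```python
def doc_count(word, docs):
    """Number of documents that contain `word` as a substring."""
    return sum(1 for doc in docs if word in doc)


def minimal_discriminating_words(pattern, docs, d):
    """Brute-force illustration of variant (i), under the assumptions above.

    For every occurrence of `pattern`, the occurrence is extended to the
    right one character at a time. The document count can only shrink as
    the word grows, so the first extension seen in fewer than `d`
    documents is prefix-minimal and is reported.
    """
    results = set()
    for doc in docs:
        start = doc.find(pattern)
        while start != -1:
            for end in range(start + len(pattern), len(doc) + 1):
                word = doc[start:end]
                if doc_count(word, docs) < d:
                    results.add(word)
                    break
            start = doc.find(pattern, start + 1)
    return sorted(results)


docs = ["abcab", "abd", "xabc"]
print(minimal_discriminating_words("ab", docs, 2))
```

With these three toy documents, "ab" occurs in all of them and "abc" in two, so for d = 2 the reported words are the first extensions that drop to a single document.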

Language: English

Keywords: algorithms; word processing algorithms

In book

Lecture Notes in Computer Science

Vol. 8214: Proceedings of the 20th Symposium on String Processing and Information Retrieval. Berlin: Springer, 2013.

Computing Lempel-Ziv Factorization Online

Starikovskaya T., in: Lecture Notes in Computer Science, Vol. 7464: Proceedings of the 37th International Symposium on Mathematical Foundations of Computer Science. Berlin: Springer, 2012. P. 789–799.

We present an algorithm which computes the Lempel-Ziv factorization of a word $W$ of length $n$ on an alphabet $\Sigma$ of size $\sigma$ online in the following sense: it reads $W$ starting from the left, and, after reading each $r = O(\log_{\sigma}{n})$ characters of $W$, updates the Lempel-Ziv factorization. The algorithm requires $O(n\log\sigma)$ bits of ...

Added: October 30, 2013

Computing Discriminating and Generic Words

Kucherov G., Nekrich Y., Starikovskaya T., in: Lecture Notes in Computer Science, Vol. 7608: Proceedings of the 19th International Symposium on String Processing and Information Retrieval. Berlin: Springer, 2012. P. 307–317.

We study the following three problems of computing generic or discriminating words for a given collection of documents. Given a pattern $P$ and a threshold $d$, we want to report (i) all longest extensions of $P$ which occur in at least $d$ documents, (ii) all shortest extensions of $P$ which occur in less than $d$ ...

Added: October 30, 2013

On Minimal and Maximal Suffixes of a Substring

Babenko M., Kolesnichenko I., Starikovskaya T., in: Lecture Notes in Computer Science, Vol. 7922: Proceedings of the 24th Symposium on Combinatorial Pattern Matching. Berlin: Springer, 2013. P. 28–37.

Lexicographically minimal and lexicographically maximal suffixes of a string are fundamental notions of stringology. It is well known that the lexicographically minimal and maximal suffixes of a given string S can be computed in linear time and space by constructing a suffix tree or a suffix array of S. Here we consider the case when ...

Added: October 30, 2013

Cross-document Pattern Matching

Kopelowitz T., Kucherov G., Nekrich Y. et al., Journal of Discrete Algorithms 2013

We study a new variant of the pattern matching problem called cross-document pattern matching, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear ...

Added: October 30, 2013

Time-Space Trade-Offs for the Longest Common Substring Problem

Vildhoj H. W., Starikovskaya T., in: Lecture Notes in Computer Science, Vol. 7922: Proceedings of the 24th Symposium on Combinatorial Pattern Matching. Berlin: Springer, 2013. P. 223–234.

Added: October 30, 2013

STAND: New tool for performance estimation of the block data processing algorithms in high-load systems

Bashun V., Minchenkov V., in: 13th Conference of Open Innovations Association FRUCT. IEEE Computer Society, 2017. P. 101–110.

The main goal of this work is to present the developed research tool to find, investigate and analyze hidden dependences between parameters of the hardware/software platforms (such as influence of NUMA architecture, memory page size, etc) and the performance of block data processing algorithms. The new toolset (STAND) allows performance estimation and comparison of block ...

Added: November 1, 2018

An empirical scrutinization of four crisp clustering methods with four distance metrics and one straightforward interpretation rule

Alvandyan T., Shalileh S., Doklady Mathematics 2024

Clustering has always been in great demand by scientific and industrial communities. However, due to the lack of ground truth, interpreting its obtained results can be debatable. The current research provides an empirical benchmark on the efficiency of three popular and one recently proposed crisp clustering methods. To this end, we extensively analyzed these (four) ...

Added: November 30, 2024

Models, Algorithms, and Technologies for Network Analysis / From the 4th International Conference on Network Analysis

NY: Springer, 2016.

Added: July 7, 2015

13th International Symposium on Parameterized and Exact Computation (IPEC 2018)

Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2019.

Added: November 13, 2019

Competition Law for the Digital Economy

Edward Elgar Publishing, 2019.

The digital economy is gradually gaining traction through a variety of recent technological developments, including the introduction of the Internet of things, artificial intelligence and markets for data. This innovative book contains contributions from leading competition law scholars who map out and investigate the anti-competitive effects that are developing in the digital economy. ...

Added: August 4, 2020

Pathset Graphs: A Novel Approach for Comprehensive Utilization of Paired Reads in Genome Assembly

Pham S. K., Antipov D., Sirotkin Alexander et al., Journal of Computational Biology 2013 Vol. 20 No. 4 P. 359–371

One of the key advances in genome assembly that has led to a significant improvement in contig lengths has been improved algorithms for utilization of paired reads (mate-pairs). While in most assemblers, mate-pair information is used in a post-processing step, the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates the mate-pair information directly in ...

Added: March 21, 2014

Dynamics of Information Systems: Algorithmic Approaches

NY: Springer, 2013.

Information systems have been developed in parallel with computer science, although information systems have roots in different disciplines including mathematics, engineering, and cybernetics. Research in information systems is by nature very interdisciplinary. As it is evidenced by the chapters in this book, dynamics of information systems has several diverse applications. The book presents the state-of-the-art ...

Added: December 12, 2013

Fast approximate energy minimization with label costs

Delong A., Osokin A., Isack H. et al., in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010). San Francisco: IEEE, 2010. P. 2173–2180.

The α-expansion algorithm [4] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main contribution is to extend α-expansion so that it can simultaneously optimize “label costs” as well. An energy with label ...

Added: October 18, 2017

21st International Symposium, Fundamentals of Computation Theory 2017, FCT 2017

Springer, 2017.

This book constitutes the refereed proceedings of the 21st International Symposium on Fundamentals of Computation Theory, FCT 2017, held in Bordeaux, France, in September 2017. The 29 revised full papers and 5 invited papers presented were carefully reviewed and selected from 99 submissions. The papers cover topics of all aspects of theoretical computer science, in ...

Added: September 14, 2017

Algorithms for Hub Label Optimization

Babenko M. A., Goldberg A. V., Gupta A. et al., ACM Transactions on Algorithms 2016 Vol. 13 No. 1 P. 16:1–16:17

We consider the hub label optimization problem, which arises in designing fast preprocessing-based shortest-path algorithms. We give O(log n)-approximation algorithms for the objectives of minimizing the maximum label size (ℓ∞-norm) and simultaneously minimizing a constant number of ℓp-norms. Prior to this, an O(log n)-approximation algorithm was known [Cohen et al. 2003] only for ...

Added: January 12, 2017

Entropy Dimension Reduction Method for Randomized Machine Learning Problems

Popkov Y., Dubnov Y. A., Popkov A. Y., Automation and Remote Control 2018 Vol. 79 No. 11 P. 2038–2051

The direct and inverse projections (DIP) method was proposed to reduce the feature space to the given dimensions oriented to the problems of randomized machine learning and based on the procedure of “direct” and “inverse” design. The “projector” matrices are determined by maximizing the relative entropy. It is suggested to estimate the information losses by ...

Added: February 12, 2019

Algorithms and Data Structures. WADS 2019. Lecture Notes in Computer Science

Springer, 2019.

16th International Symposium, WADS 2019, Edmonton, AB, Canada, August 5–7, 2019, Proceedings ...

Added: October 26, 2021

The Problem of Authenticating a Work of Art: An Interdisciplinary Perspective

Kartasheva A. A., Philosophical Problems of Information Technologies and Cyberspace 2024

The problem of confirming the authenticity of works of art is especially acute in the digital environment. The issue of authentication is central to both intellectual property law and cultural communication strategies. Originality is a necessary parameter for a work to be recognized as original. Although the concept of «originality» is not enshrined in legal acts, it is an important element for the construction of «authenticity», which is often produced by «cultural intermediaries». The latter can be not only ...

Added: November 30, 2023

Discrete Models in Control Systems Theory: 10th International Conference, Moscow and Moscow Region, May 23–25, 2018: Proceedings

Moscow State University, MAKS Press, 2018.

The proceedings of the tenth international conference "Discrete Models in Control Systems Theory" (Moscow and Moscow Region, May 23-25, 2018) includes 100 papers on such topics as discrete functional systems, properties of discrete functions, synthesis and complexity of control systems, reliability, control and diagnostics of control systems, automata, graph theory, combinatorics, coding theory, mathematical methods ...

Added: June 14, 2018

Defamation and Algorithms: A New Dimension of an Old Problem

Diskin E., Zakon 2024 No. 1 P. 24–28

The issue of protecting the legitimate rights of persons who were defamed is not new in Russian legal science. The problem of protection of honor and dignity was known to classical Roman law and was the subject of study of pre-revolutionary and Soviet lawyers. However, the classic civilistic constructions formulated in the Civil Code were ...

Added: January 30, 2024

Proceedings of the 2014 ACM conference on SIGCOMM

NY: ACM Press, 2014.

Added: October 10, 2014

Proceedings of the 6th International Conference on Similarity Search and Applications (SISAP 2013), Lecture Notes in Computer Science

Berlin, Heidelberg: Springer, 2013.

This volume contains the papers presented at the 6th International Conference on Similarity Search and Applications (SISAP 2013), held at A Coruña, Spain, during October 2–4, 2013. The International Conference on Similarity Search and Applications (SISAP) is an annual forum for researchers and application developers in the area of similarity data management. It aims at the technological problems shared ...

Added: September 27, 2013

Diagnostic Test Approaches to Machine Learning and Commonsense Reasoning Systems

Naidenova X., Ignatov D. I., Hershey: IGI Global, 2012.

The consideration of symbolic machine learning algorithms as an entire class will make it possible, in the future, to generate algorithms, with the aid of some parameters, depending on the initial users’ requirements and the quality of solving targeted problems in domain applications. Diagnostic Test Approaches to Machine Learning and Commonsense Reasoning Systems surveys, analyzes, and ...

Added: December 3, 2012

Optimal Control Algorithms and Their Analysis for Short-Term Scheduling in Manufacturing Systems

Sokolov B. V., Ivanov D., Dolgui A., Algorithms 2018 Vol. 11 No. 5 P. 57

Added: February 11, 2020