### Book

## Lecture Notes in Computer Science

This book constitutes the refereed proceedings of the 25th Annual Symposium on Combinatorial Pattern Matching, CPM 2014, held in Moscow, Russia, in June 2014. The 28 revised full papers presented together with 5 invited talks were carefully reviewed and selected from 54 submissions. The papers address issues of searching and matching strings and more complicated patterns such as trees; regular expressions; graphs; point sets; and arrays. The goal is to derive combinatorial properties of such structures and to exploit these properties in order to achieve superior performance for the corresponding computational problems. The meeting also deals with problems in computational biology; data compression and data mining; coding; information retrieval; natural language processing; and pattern recognition.

We revisit the problems of computing the maximal and the minimal non-empty suffixes of a substring of a longer text of length *n*, introduced by Babenko, Kolesnichenko and Starikovskaya [CPM’13]. For the minimal suffix problem we show that for any 1 ≤ *τ* ≤ log*n* there exists a linear-space data structure with(τ)query time and(nlogn/τ)preprocessing time. As a sample application, we show that this data structure can be used to compute the Lyndon decomposition of any substring of the text in(kτ)time, where *k* is the number of distinct factors in the decomposition. For the maximal suffix problem we give a linear-space structure with(1)query time and(n)preprocessing time, i.e., we manage to achieve both the optimal query and the optimal construction time simultaneously.

This book constitutes the refereed proceedings of the 20th International Symposium on String Processing and Information Retrieval, SPIRE 2013, held in Jerusalem, Israel, in October 2013. The 18 full papers, 10 short papers were carefully reviewed and selected from 60 submissions. The program also featured 4 keynote speeches. The following topics are covered: fundamentals algorithms in string processing and information retrieval; SP and IR techniques as applied to areas such as computational biology, DNA sequencing, and Web mining.

One of the key advances in genome assembly that has led to a significant improvement in contig lengths has been improved algorithms for utilization of paired reads (mate-pairs). While in most assemblers, mate-pair information is used in a post-processing step, the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates the mate-pair information directly in the assembly graph structure. However, the PDBG approach faces difficulties when the variation in the insert sizes is high. To address this problem, we first transform mate-pairs into edge-pair histograms that allow one to better estimate the distance between edges in the assembly graph that represent regions linked by multiple mate-pairs. Further, we combine the ideas of mate-pair transformation and PDBGs to construct new data structures for genome assembly: pathsets and pathset graphs.

In this paper, we present a modification of dynamic programming algorithms (DPA), which we denote as graphical algorithms (GrA). For some single machine scheduling problems, it is shown that the time complexity of the GrA is less than the time complexity of the standard DPA. Moreover, the average running time of the GrA is often essentially smaller. A GrA can also solve large-scale instances and instances, where the parameters are not integer. For some problems, GrA has a polynomial time complexity in contrast to a pseudo-polynomial complexity of a DPA.

We study a new variant of the pattern matching problem called *cross-document pattern matching*, which is the problem of indexing a collection of documents to support an efficient search for a pattern in a selected document, where the pattern itself is a substring of another document. Several variants of this problem are considered, and efficient linear space solutions are proposed with query time bounds that either do not depend at all on the pattern size or depend on it in a very limited way (doubly logarithmic). As a side result, we propose an improved solution to the *weighted ancestor* problem.

Information systems have been developed in parallel with computer science, although information systems have roots in different disciplines including mathematics, engineering, and cybernetics. Research in information systems is by nature very interdisciplinary. As it is evidenced by the chapters in this book, dynamics of information systems has several diverse applications. The book presents the state-of-the-art work on theory and practice relevant to the dynamics of information systems. First, the book covers algorithmic approaches to numerical computations with infinite and infinitesimal numbers. Also the book presents important problems arising in service-oriented systems, such as dynamic composition, analysis of modern service-oriented information systems, and estimation of customer service times on a rail network from GPS data. After that, the book addresses the complexity of the problems arising in stochastic and distributed systems. In addition, the book discusses modulating communication for improving multi-agent learning convergence. Network issues, in particular minimum risk maximum clique problems, vulnerability of sensor networks, influence diffusion, community detection, and link prediction in social network analysis, as well as a comparative analysis of algorithms for transmission network expansion planning are described in subsequent chapters. We thank all the authors and anonymous referees for their advice and expertise in providing valuable contributions, which improved the quality of this book. Furthermore, we want to thank Springer for helping us to produce this book.

In this paper, we consider algorithms involved in the computation of the Duquenne–Guigues basis of implications. The most widely used algorithm for constructing the basis is Ganter’s Next Closure, designed for generating closed sets of an arbitrary closure system. We show that, for the purpose of generating the basis, the algorithm can be optimized. We compare the performance of the original algorithm and its optimized version in a series of experiments using artificially generated and real-life datasets. An important computationally expensive subroutine of the algorithm generates the closure of an attribute set with respect to a set of implications. We compare the performance of three algorithms for this task on their own, as well as in conjunction with each of the two algorithms for generating the basis. We also discuss other approaches to constructing the Duquenne–Guigues basis.

We revisit the problems of computing the maximal and the minimal non-empty suffixes of a substring of a longer text of length *n*, introduced by Babenko, Kolesnichenko and Starikovskaya [CPM’13]. For the minimal suffix problem we show that for any 1 ≤ *τ* ≤ log*n* there exists a linear-space data structure with(τ)query time and(nlogn/τ)preprocessing time. As a sample application, we show that this data structure can be used to compute the Lyndon decomposition of any substring of the text in(kτ)time, where *k* is the number of distinct factors in the decomposition. For the maximal suffix problem we give a linear-space structure with(1)query time and(n)preprocessing time, i.e., we manage to achieve both the optimal query and the optimal construction time simultaneously.

We consider certain spaces of functions on the circle, which naturally appear in harmonic analysis, and superposition operators on these spaces. We study the following question: which functions have the property that each their superposition with a homeomorphism of the circle belongs to a given space? We also study the multidimensional case.

We consider the spaces of functions on the m-dimensional torus, whose Fourier transform is p -summable. We obtain estimates for the norms of the exponential functions deformed by a C1 -smooth phase. The results generalize to the multidimensional case the one-dimensional results obtained by the author earlier in “Quantitative estimates in the Beurling—Helson theorem”, Sbornik: Mathematics, 201:12 (2010), 1811 – 1836.

We consider the spaces of function on the circle whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C1 -smooth phase.

The problem of minimizing the root mean square deviation of a uniform string with clamped ends from an equilibrium position is investigated. It is assumed that the initial conditions are specified and the ends of the string are clamped. The Fourier method is used, which enables the control problem with a partial differential equation to be reduced to a control problem with a denumerable system of ordinary differential equations. For the optimal control problem in the l2 space obtained, it is proved that the optimal synthesis contains singular trajectories and chattering trajectories. For the initial problem of the optimal control of the vibrations of a string it is also proved that there is a unique solution for which the optimal control has a denumerable number of switchings in a finite time interval.

For a class of optimal control problems and Hamiltonian systems generated by these problems in the space *l *2, we prove the existence of extremals with a countable number of switchings on a finite time interval. The optimal synthesis that we construct in the space *l *2 forms a fiber bundle with piecewise smooth two-dimensional fibers consisting of extremals with a countable number of switchings over an infinite-dimensional basis of singular extremals.

This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems” which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.

In this paper, we construct a new distribution corresponding to a real noble gas as well as the equation of state for it.