Book chapter
The "dynamic molecular portrait" technology in the computational analysis of proteins and biomembranes
In this paper, we study the Maximum Happy Vertices and Maximum Happy Edges problems (MHV and MHE for short). These problems have recently attracted considerable attention and were studied in Agrawal ’17, Aravind et al. ’16, Choudhari and Reddy ’18, and Misra and Reddy ’17. The main focus of our work is lower bounds on the computational complexity of these problems. The established lower bounds fall into the following groups: NP-hardness of the above-guarantee parameterization, kernelization lower bounds (answering questions of Misra and Reddy ’17), exponential lower bounds under the Set Cover Conjecture and the Exponential Time Hypothesis, and inapproximability results. Moreover, we present an O*(ℓ^k) randomized algorithm for MHV and an O*(2^k) algorithm for MHE, where ℓ is the number of colors used and k is the number of required happy vertices or edges. Given the proved lower bounds, these algorithms cannot be improved to subexponential running time.
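To make the two objectives concrete, the following minimal sketch counts happy vertices and happy edges for a given coloring. It assumes the standard definitions from the happy coloring literature, which the abstract itself does not restate: an edge is happy if both endpoints share a color, and a vertex is happy if all of its neighbors share its color. The function name `count_happy` and the input format are illustrative choices, not taken from the paper, and the code only evaluates a coloring; it is not one of the algorithms mentioned above.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def count_happy(edges: List[Edge], coloring: Dict[Hashable, int]) -> Tuple[int, int]:
    """Return (happy_vertices, happy_edges) for a fully colored graph.

    Assumed definitions (standard in the literature, not quoted from the abstract):
    an edge is happy if its endpoints have the same color; a vertex is happy
    if every neighbor has the same color as the vertex itself.
    """
    # Build adjacency sets; every colored vertex appears, even if isolated.
    adjacency = {v: set() for v in coloring}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    happy_edges = sum(1 for u, v in edges if coloring[u] == coloring[v])
    # An isolated vertex is counted as happy (the condition holds vacuously).
    happy_vertices = sum(
        1
        for v, neighbors in adjacency.items()
        if all(coloring[n] == coloring[v] for n in neighbors)
    )
    return happy_vertices, happy_edges


# Path a-b-c-d where only d has a different color.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
path_coloring = {"a": 1, "b": 1, "c": 1, "d": 2}
print(count_happy(path_edges, path_coloring))  # (2, 2)
```

In the example, vertices a and b are happy, while c and d are not because the edge between them joins two different colors.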
The current state of the information systems at the Pushchino Research Center of the Russian Academy of Sciences is considered with respect to solving problems of computational biology.
One of the key advances in genome assembly that has led to a significant improvement in contig lengths has been improved algorithms for utilization of paired reads (mate-pairs). While in most assemblers, mate-pair information is used in a post-processing step, the recently proposed Paired de Bruijn Graph (PDBG) approach incorporates the mate-pair information directly in the assembly graph structure. However, the PDBG approach faces difficulties when the variation in the insert sizes is high. To address this problem, we first transform mate-pairs into edge-pair histograms that allow one to better estimate the distance between edges in the assembly graph that represent regions linked by multiple mate-pairs. Further, we combine the ideas of mate-pair transformation and PDBGs to construct new data structures for genome assembly: pathsets and pathset graphs.
Optimization Approaches for Solving String Selection Problems provides an overview of optimization methods for a wide class of genomics-related problems known as string selection problems. This class of problems addresses the recognition of similar characteristics or differences within biological sequences. Specifically, the book considers a large class of problems, ranging from the closest string and substring problems, to the farthest string and substring problems, to the far from most string problem. For each problem, it gives a detailed description that highlights both biological and mathematical features and presents state-of-the-art approaches.
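As an illustration of the simplest problem in this family, the sketch below solves the closest string problem by exhaustive search. It assumes the usual formulation, which the summary above does not spell out: given strings of equal length, find a string of the same length that minimizes the maximum Hamming distance to the inputs. The names `hamming` and `closest_string` are hypothetical, and brute force is used only for clarity; it is exponential in the string length and is not one of the approaches described in the book.

```python
from itertools import product
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings: List[str], alphabet: str) -> Tuple[str, int]:
    """Brute-force closest string: try every candidate over the alphabet.

    Returns a candidate with the smallest maximum Hamming distance to the
    inputs, together with that distance. Only practical for tiny inputs.
    """
    length = len(strings[0])
    if any(len(s) != length for s in strings):
        raise ValueError("all input strings must have the same length")

    best, best_radius = "", length + 1
    for letters in product(alphabet, repeat=length):
        candidate = "".join(letters)
        radius = max(hamming(candidate, s) for s in strings)
        if radius < best_radius:
            best, best_radius = candidate, radius
    return best, best_radius


center, radius = closest_string(["ACGT", "ACGA", "ACCT"], "ACGT")
print(center, radius)
```

The farthest string problem can be sketched the same way by maximizing the minimum Hamming distance instead.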
Molecular surfaces are among the key players in processes of biomolecular recognition and interaction. State-of-the-art methods exist for visualizing molecular surfaces and surface-distributed properties in three-dimensional space. However, such visual information can only be analyzed by the human eye, which makes the analysis prone to bias and onerous for large sets of objects. Here we present a method to create 2D projections, or "earth maps", of the whole protein surface: protein surface topography (PST). Representing complex molecular surfaces as an array of data offers simple and pictorial visualization of surface properties. PST can be used to easily visualize conformational changes between different states of molecules, perform group analysis, and reveal common patterns or dissimilarities. It is a useful addition to docking experiments, illustrating complementary features between ligand and receptor surfaces.
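The idea of turning a 3D surface into a 2D array can be sketched as follows. This is not the authors' PST procedure, whose details are not given in the abstract; it is a minimal illustration that assumes an equirectangular (latitude/longitude) projection around the centroid of the surface points and averages a per-point property in each grid cell. The function name `project_to_map` and the grid sizes are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def project_to_map(
    points: Sequence[Point],
    values: Sequence[float],
    n_lat: int = 18,
    n_lon: int = 36,
) -> List[List[Optional[float]]]:
    """Average a surface property on a latitude/longitude grid.

    Assumption: points are viewed from their centroid and mapped with a plain
    equirectangular projection. Cells without points are set to None.
    """
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n

    sums = [[0.0] * n_lon for _ in range(n_lat)]
    counts = [[0] * n_lon for _ in range(n_lat)]
    for (x, y, z), value in zip(points, values):
        dx, dy, dz = x - cx, y - cy, z - cz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        if r == 0.0:
            continue  # a point at the centroid has no direction
        lat = math.asin(max(-1.0, min(1.0, dz / r)))  # -pi/2 .. pi/2
        lon = math.atan2(dy, dx)                      # -pi .. pi
        row = min(int((lat + math.pi / 2) / math.pi * n_lat), n_lat - 1)
        col = min(int((lon + math.pi) / (2 * math.pi) * n_lon), n_lon - 1)
        sums[row][col] += value
        counts[row][col] += 1

    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else None for j in range(n_lon)]
        for i in range(n_lat)
    ]


# Two points on opposite sides of the centroid land in different map cells.
grid = project_to_map([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)], [2.0, 4.0])
filled = [(i, j, v) for i, row in enumerate(grid) for j, v in enumerate(row) if v is not None]
print(filled)
```

Once the surface is stored as such an array, maps of different conformations or different proteins can be compared cell by cell, which is the kind of group analysis the abstract refers to.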