COMPARISON OF METHODS FOR DETECTING EXCEPTIONAL SEQUENCES IN PROKARYOTIC GENOMES

Ershova A. S.; Rusinov I. S.; Karyagina A. S.; Spirin S. A.; Alexeevski A. V.

doi:10.1134/S0006297918020050

Биохимия. 2018. Vol. 83. No. 2. P. 225–237.

Many proteins must recognize specific DNA sequences in order to function. The number of recognition sites and their distribution along the DNA may be biologically important. For example, the number of restriction sites is often reduced in prokaryotic and phage genomes, which lowers the probability of DNA cleavage by restriction endonucleases. We call a sequence exceptional if its frequency in a genome differs significantly from the frequency predicted by some mathematical model. An exceptional sequence can be either under- or over-represented, depending on how its observed frequency compares with the predicted one. Exceptional sequences may be biologically meaningful, for example, as targets of DNA-binding proteins or as parts of abundant repetitive elements. Several methods are used to predict the frequency of a short sequence in a genome from the actual frequencies of certain of its subsequences. The most popular are methods based on Markov chain models. However, no rigorous comparison of these methods has previously been performed. We compared three methods for predicting short sequence frequencies: the maximum-order Markov chain model-based method, the method that uses the geometric mean of extended Markovian estimates, and the method that uses the frequencies of all subsequences, including discontiguous ones. We applied them to restriction sites in complete genomes of 2500 prokaryotic species and demonstrated that the results depend greatly on the method used: the lists of the 5% most under-represented sites differed by up to 50%. The method designed by Burge and coauthors in 1992, which uses all subsequences of the sequence, showed higher precision than the other two methods, both on prokaryotic genomes and on randomly generated sequences after computational imitation of selective pressure. We propose this method as the first choice for detecting exceptional sequences in prokaryotic genomes.
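As an illustration of the first of the three approaches, the maximum-order Markov chain estimate predicts the count of a word w1…wn from the counts of its two longest proper subwords and their shared core: E(w1…wn) = N(w1…wn−1) · N(w2…wn) / N(w2…wn−1). The sketch below is not the authors' implementation. It assumes a single DNA strand given as a plain string with overlapping occurrences counted, and the function names are hypothetical.

```python
from collections import Counter


def count_words(genome: str, k: int) -> Counter:
    """Count all overlapping k-mers on one strand of the sequence."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def expected_count_max_order_markov(genome: str, word: str) -> float:
    """Maximum-order Markov estimate: N(prefix) * N(suffix) / N(core)."""
    n = len(word)
    if n < 3:
        raise ValueError("the estimate needs a word of length >= 3")
    longer = count_words(genome, n - 1)   # counts of (n-1)-mers
    shorter = count_words(genome, n - 2)  # counts of (n-2)-mers
    core = shorter[word[1:-1]]
    if core == 0:
        return 0.0  # the core never occurs, so the word cannot occur either
    return longer[word[:-1]] * longer[word[1:]] / core


def contrast_ratio(genome: str, word: str) -> float:
    """Observed / expected count; values well below 1 suggest avoidance."""
    observed = count_words(genome, len(word))[word]
    expected = expected_count_max_order_markov(genome, word)
    return observed / expected if expected > 0 else float("nan")
```

For a real analysis one would normally count on both strands and attach a significance measure to the ratio. Both are left out here to keep the sketch short.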

Research target: Biology; Computer Science

Language: Russian

Keywords: Markov chain models; compositional bias; DNA sequence; prokaryotic genome; restriction-modification system; restriction sites

SMMR: Sampling-Based MMR Reranking for Faster, More Diverse, and Balanced Recommendations and Retrieval

Ananyeva M., Liakhnovich K., Lashinin O. et al., Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval 2025 P. 2754–2758

Relevance and diversity are critical objectives in modern information retrieval (IR), particularly in recommender systems. Achieving a balance between relevance (exploitation) and diversity (exploration) optimizes user satisfaction and business goals such as catalog coverage and novelty. While existing post-processing reranking methods address this trade-off, they usually rely on greedy strategies, leading to suboptimal outcomes for ...

Added: February 3, 2026

Natural Language Processing and Information Systems : 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Kanazawa, Japan, July 4-6, 2025 : proceedings. Part I

Springer, 2025.

The two-volume set LNCS 15836 and 15837 constitutes the proceedings of the 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, held in Kanazawa, Japan, during July 4–6, 2025. The 33 full papers, 19 short papers and 2 demo papers presented in this volume were carefully reviewed and selected from 120 submissions. ...

Added: February 3, 2026

Measuring Chemical LLM robustness to molecular representations: a SMILES variation-based framework

Tutubalina E., Храбров К., Ганеева В. et al., Journal of Cheminformatics 2025 No. 17 Article 164

The recent integration of natural language processing into chemistry has advanced drug discovery. Molecule representations in language models (LMs) are crucial to enhance chemical understanding. We explored the ability of models to match the same chemical structures despite their different representations. Recognizing the same substance in different representations is an important component of emulating the ...

Added: February 3, 2026

Effect of IGFBP6 Knockdown on Proteins Regulating Exosome Synthesis and Secretion in MDA-MB-231 Cell Line

Antipenko I., Bulletin of Experimental Biology and Medicine 2023 Vol. 175 P. 157–161

One of the potential causes of cancer recurrence is disruption of the cell-cell communication in the primary tumors that is realized, among other things, through secretion and uptake of exosomes by cells. Low expression of the IGFBP6 gene (insulin-like growth factor binding protein 6) is associated with a high recurrence rate and can serve as a prognostic ...

Added: January 30, 2026

Comprehensive characterization of five Lactococcus strains: from phenotypic properties to genomic features

Antipenko I., Shkurnikov M., Acta Naturae 2025 Vol. 17 No. 4 P. 83–92

The efficiency of dairy product fermentation depends on the characteristics of lactic acid bacteria, primarily their metabolic activity and resistance to bacteriophages, so it is important to understand the relationship between the genetic and phenotypic features of strains used in industry. We carried out a comprehensive analysis of five Lactococcus strains widely used in Russia, using whole-genome sequencing and an assessment of phenotypic properties. Despite the genetic similarity of four L. ...

Added: January 30, 2026

Whole-genome sequencing reveals variability of metabolic and immune systems in Propionibacterium freudenreichii isolates

Antipenko I., Венедюхина С. А., Shkurnikov M., Acta Naturae 2025 Vol. 17 No. 4 P. 72–82

The bacterium Propionibacterium freudenreichii plays an important role in the production of Swiss-type cheeses, but the genomic variability of its strains, which affects their technological properties, remains insufficiently studied. The metabolic and genetic differences of industrial P. freudenreichii strains were characterized. Comparing phenotypic and genomic data makes it possible to identify markers of technologically significant traits and to use them for screening new strains. This creates a basis for selecting consortia with specified ...

Added: January 30, 2026

Assessing the roles of subjective value and valence in outcome evaluation for consumer products: evidence from behavioral and electrophysiological experiments

Doudou L., Qiang S., Yina A. et al., Acta Psychologica 2026 Vol. 262 P. 1–9

Value-based decision-making is ubiquitous in our daily lives, yet most EEG studies focus on monetary outcomes, with limited attention to how the brain encodes the subjective value and valence of consumer products during outcome evaluation. To address these questions, we set up a novel three-stage task to investigate the behavioral regularities of recall of valence ...

Added: January 29, 2026

A speech signal transformation method for improving speech intelligibility

Savchenko L., Савченко В. В., Радиотехника и электроника 2025 Vol. 70 No. 8 P. 753–760

The problem of improving speech intelligibility in voice communication systems is considered. The acute issue of speaker recognition when applying known methods for solving this problem is highlighted. To overcome the specified problem, a new method for transforming the speech signal based on an autoregressive model of the vocal tract and the principle of frequency-selective ...

Added: January 29, 2026

Specification Tests for Jump-Diffusion Models Based on the Characteristic Function

Belomestny D., Grobler G. L., Meintanis S. G. et al., International Statistical Review 2026 P. 1–31

Goodness-of-fit tests are suggested for several popular jump-diffusion processes. The suggested test statistics utilise the marginal characteristic function of the model and its L2-type discrepancy from an empirical counterpart. Model parameters are estimated either by minimising the aforementioned L2-type discrepancy or by maximum likelihood. A hybrid estimation method that uses moment estimation is also proposed ...

Added: January 29, 2026

An Analysis of Sequential Patterns in Datasets for Evaluation of Sequential Recommendations

Klenitskiy A., Володкевич А. А., Pembek A. et al., ACM Transactions on Recommender Systems 2026

Sequential recommender systems are an important and in-demand area of research. These systems aim to use the order of interactions in a user’s history to predict future interactions. The premise is that the order of interactions and sequential patterns play an essential role. Therefore, it is crucial to use datasets that exhibit a sequential structure ...

Added: January 28, 2026

Autoregressive generation strategies for Top-K sequential recommendations

Klenitskiy A., Гусак Д. И., Володкевич А. А. et al., User Modelling and User-Adapted Interaction 2025 No. 35 Article 13

The goal of modern sequential recommender systems is often formulated in terms of next-item prediction. In this paper, we explore the applicability of transformer-based generative models for the Top-K sequential recommendation task, where the goal is to predict items that a user is likely to interact with in the “near future.” This goal aligns with ...

Added: January 26, 2026

Marchenko–Pastur Law for Spectra of Random Weighted Bipartite Graphs

Nadutkina A., Tikhomirov A., Timushev D., Siberian Advances in Mathematics 2025 Vol. 34 P. 146–153

We study the spectra of random weighted bipartite graphs. We establish that, under specific assumptions on the edge probabilities, the symmetrized empirical spectral distribution function of the graph’s adjacency matrix converges to the symmetrized Marchenko-Pastur distribution function. ...

Added: January 26, 2026

Conceptual Knowledge Structures First International Joint Conference, CONCEPTS 2024, Cádiz, Spain, September 9–13, 2024, Proceedings

Obiedkov S., Switzerland: Springer, 2024.

This book constitutes the proceedings of the First International Joint Conference on Conceptual Knowledge Structures, CONCEPTS 2024, which took place in Cádiz, Spain, during September 9-13, 2024. The conference is an amalgamation of the 18th International Conference on Formal Concept Analysis (ICFCA); the 17th International Conference on Concept Lattices and Their Applications (CLA); and the 28th ...

Added: January 23, 2026

Cooperative games with fuzzy characteristic functions on concept lattices

Kemgne M. W., Njionou B. B., Ignatov D. I. et al., International Journal of Approximate Reasoning 2025 Vol. 186 P. 1–18

This paper introduces cooperative games with transferable utilities and fuzzy characteristic functions on concept lattices. While previous works have independently addressed games with fuzzy payoffs and games restricted to structured coalition systems such as lattices, our approach combines both perspectives. We consider cooperative settings where coalition formation is constrained by a concept lattice structure, and ...

Added: January 23, 2026

Run time dynamic digital twins and dynamic digital twins networks

Vodyaho A., Delhibabu R., Ignatov D. I. et al., Future Generation Computer Systems 2025 Vol. 172 P. 1–18

Digital twins are widely used for building various types of cyber–physical systems. There are a huge number of publications devoted to the use of digital twins in production systems. Much less attention is paid to the issues of building runtime digital twins. The article describes an approach to building complex distributed cyber–physical systems with a ...

Added: January 23, 2026

LAMBO: Landmarks Augmentation With Manifold-Barycentric Oversampling

Bespalov Y., Buzun N., Kachan O. et al., IEEE Access 2022 No. 10 Article 3219934

We propose the first data augmentation method based on optimal transport theory, with the generated data being guaranteed to belong to the original data manifold. The proposed algorithm randomly samples a clique in the nearest-neighbors graph representing the data knowledge and computes the Wasserstein barycenter between the neighbours with random uniform weights. Being extremely natural- ...

Added: January 21, 2026

Blurred Magnitude Homology of Functional Connectome for ASD Diagnosis

Alexander Kachura, Vsevolod Chernyshev, Kachan O. et al., Frontiers in Psychiatry 2026 Vol. 16 Article 1677282

Autism spectrum disorder (ASD) is one of the most common neurodevelopmental disorders. Existing studies show that adults with ASD may experience accelerated or altered neurocognitive aging. Consequently, cognitive decline in people with ASD can be delayed if timely measures are taken to treat this disorder. This study focuses on the development of a new algorithm ...

Added: January 21, 2026

19th Annual Conference, TAMC 2025, Jinan, China, September 19–21, 2025, Proceedings. Theory and Applications of Models of Computation. Lecture Notes in Computer Science (LNCS, volume 16084)

Springer, 2026.

This book constitutes the proceedings of the 19th Annual Conference on Theory and Applications of Models of Computation, TAMC 2025, which was held in Jinan, China, during September 19–21, 2025. ...

Added: January 20, 2026

Selection of information security tools in automated systems based on Markov models of cyberattacks

Трапезников Е. В., Безопасность информационных технологий 2025 Vol. 30 No. 4 P. 102–113

One of the main problems in ensuring the information security of automated systems is the lack of universal approaches to the quantitative assessment of their effectiveness. The article considers one possible approach to this problem, based on models of cyberattacks described in terms of Markov chains with absorbing states. A model has been developed in which, unlike similar models by other authors, it is provided that attacks ...

Added: October 6, 2025

Evolution of restriction-modification systems containing one restriction endonuclease and two DNA methyltransferases

Фокина А. С., Карягина А. С., Русинов И. С. et al., Биохимия 2023 Vol. 88 No. 2 P. 285–294

Some restriction–modification systems contain two DNA methyltransferases. In the present work, we have classified such systems according to the families of catalytic domains present in the restriction endonucleases and both DNA methyltransferases. Evolution of the restriction–modification systems containing an endonuclease with a NOV_C family domain and two DNA methyltransferases, both with DNA_methylase family domains, was investigated in ...

Added: December 1, 2023

Development of early diagnosis of Parkinson's disease and a comprehensive economic analysis of the effect of its implementation

Гусев Е. И., Блохин В. Е., Vartanov S. et al., Журнал неврологии и психиатрии им. С.С. Корсакова 2021 Vol. 121 No. 1 P. 9–20

The article summarizes literature data and the authors' own data on the development of early (preclinical) diagnosis of Parkinson's disease (PD). Introducing this diagnosis will promote the use of preventive therapy and will change investment in the diagnosis and treatment of patients. The article emphasizes that at present the only approach to early diagnosis of PD is positron emission tomography of the nigrostriatal dopaminergic system, but it cannot be used for preventive screening because of its high cost. The authors believe that a less specific but more promising approach ...

Added: September 15, 2023

Avoidance of recognition sites of restriction-modification systems is a widespread but not universal anti-restriction strategy of prokaryotic viruses

Rusinov I. S., Ershova A. S., Karyagina A. S. et al., BMC Genomics 2018 Vol. 19 No. 885 P. 1–11

Background. Restriction-modification (R-M) systems protect bacteria and archaea from attacks by bacteriophages and archaeal viruses. An R-M system specifically recognizes short sites in foreign DNA and cleaves it, while such sites in the host DNA are protected by methylation. Prokaryotic viruses have developed a number of strategies to overcome this host defense. The simplest anti-restriction ...

Added: December 20, 2018

Deriving Non–homogeneous DNA Markov Chain Models by Cluster Analysis Algorithm Minimizing Multiple Alignment Entropy

Borodovsky M., Peresetsky A., Computers and Chemistry 1994 Vol. 18 No. 3 P. 259–268

Non-homogeneous Markov chain models can represent biologically important regions of DNA sequences. The statistical pattern that is described by these models is usually weak and was found primarily because of strong biological indications. The general method for extracting similar patterns is presented in the current paper. The algorithm incorporates cluster analysis, multiple alignment and entropy minimization. ...

Added: April 20, 2018

Queueing Theory

Kashtanov V., Ivchenko G., Коваленко И. Н., Moscow: Либроком, 2012.

This textbook presents, in a form accessible for a first study, the elements of the main branches of queueing theory, the area of probability theory that studies systems designed to serve a mass flow of requests of a random nature. It gives a general description of queueing systems and highlights such branches of the theory as asymptotic methods, priority systems, statistics of queueing systems, and simulation of queueing systems. The second edition ...

Added: October 18, 2012