Big Data in Bioinformatics

Nazipova N. N.; E. Isaev; V. Kornilov; D. Pervukhin; Morozova A.; A. A. Gorbunov; Ustinin M. N.

doi:10.17537/2018.13.t1


Mathematical Biology and Bioinformatics. 2018. No. 13. P. t1–t16.

Nazipova N. N., Isaev E., Kornilov V., Pervukhin D., Morozova A., Gorbunov A. A., Ustinin M. N.

Sequencing of the human genome began in 1994. It took 10 years of work by many scientific teams to obtain a rough sequence of human DNA. Modern sequencing technologies can deliver the genome of a specific person in a few days. We discuss the successes of modern bioinformatics associated with the emergence of high-throughput sequencing platforms, which not only expanded the capabilities of many areas of biology and related sciences but also gave rise to the phenomenon of big data. The article substantiates the need to develop new technologies and methods for organizing the storage, management, analysis, and visualization of big data. Modern bioinformatics faces not only the problem of big data but also a huge variety of processing and presentation methods and the simultaneous existence of many software tools and data formats. We discuss ways to solve these problems, in particular by drawing on experience with big data from other areas, such as network analysis and business data analysis. Non-relational database management systems will help solve the problem of storing big data while keeping search query execution times acceptable. New programming technologies, such as generic programming and visual programming, are designed to address the diversity of genomic data formats and to let users quickly create their own data-processing scripts.

Research target: Biology; Computer Science

Priority areas: IT and mathematics; business informatics

Keywords: big data; bioinformatics

Total conditional complexity of certain objects

Vereshchagin N., Information and Computation 2026 Vol. 308 P. 1–12

The fine approach to measure information dependence is based on the total conditional complexity CT( y |x), which is defined as the minimal length of a total program that outputs y on the input x. It is known that the total conditional complexity can be much larger than the plain conditional complexity. Such strings x, y are defined ...

Added: February 14, 2026

Diffusion models for synthetic tabular data generation

Hushchyn M., Telesheva E., Doklady Mathematics 2025 No. 527 P. 388–399

The problem of generating high-quality synthetic data is crucial for many data science tasks. A generated dataset can cut the cost of augmenting existing data with additional instances, for example, in physics, or help protect privacy, for instance, in banking. However, generating a tabular dataset is challenging, as the data ...

Added: February 12, 2026

A Non-Evolutionary View of the Scientist's Behavior

Plusnin J., Управление наукой: теория и практика 2025 Vol. 7 No. 2 P. 199–209

The problem of professional motivations of scientists’ activity, their style of behavior in science and the preceding choice of a life path is discussed from the perspective of the concept of invariance of psychobiological bases of behavior. The author substantiates the assertion that the nature of the scientist (their personality type, behavior style and motivational ...

Added: February 12, 2026

Real-Bogus Classification for ZTF Data Releases: Two Approaches

Semenikhin n., Kornilov M., Lavrukhina A. et al., Communications in Computer and Information Science 2026 Vol. 2641 P. 211–219

We considered two fundamentally different approaches to real-bogus classification within the Zwicky Transient Facility survey data. The first approach is based on neural networks that take sequences of object images as input. The second approach uses features extracted from light curves and classical machine learning methods. Several models for both approaches were tested. Quality metrics ...

Added: February 12, 2026

Proglacial successions of springtail assemblages (Collembola) along retreating glaciers in Kabardino-Balkaria, Greater Caucasus, Russia

Антипова М. Д., Бушуева И. С., Бабенко А. Б., Nature conservation research. Zapovednaâ nauka 2026 Vol. 11 No. 1 P. 71–92

Since the end of the Little Ice Age, glacier retreat has been recorded globally, with its rates steadily increasing. Glacier forelands serve as convenient areas for studying the patterns of biotic community formation during primary succession. Collembola (hereinafter – springtails) typically play key roles in primary successions, being among the first colonists of territories newly ...

Added: February 12, 2026

Mechanisms of Social Self-Organization

Plusnin J., Идеи и идеалы 2025 Vol. 17 No. 1, Part 1 P. 105–128

The author proposes a hypothesis of two types of mechanisms of social self-organization: its launch and maintenance of structural integrity. These are two fundamentally different mechanisms. Social self-organization requires four mandatory conditions: (1) a set of interacting elements (individuals), homogeneous in origin and creating, due to their common habitat, a behavioral population system; (2) the ...

Added: February 12, 2026

Reliability Problems of User Ratings and Reviews on Marketplaces: A Systems Approach

Полежаева Я. В., Popov V., Бизнес-информатика 2025 Vol. 19 No. 24 P. 26–41

User ratings and reviews on marketplaces are subject to systematic distortions, creating serious risks for e-commerce participants and reducing the efficiency of market mechanisms. This study presents a comprehensive analysis of information distortion problems, covering the process from rating formation to its systematic accounting. The aim of the work is to systematize factors of information distortion on marketplaces and ...

Added: February 11, 2026

Development of a Language Model for Automated Classification of English-Language Scientific Articles by SRSTI Codes

Zunin V., Afonin A. I., Anoshin V. I. et al., Automatic Documentation and Mathematical Linguistics 2025 Vol. 5 No. 59 P. 287–293

The development of an artificial intelligence-based language model for classifying English-language scientific articles by SRSTI codes is described. This improves the processes of reviewing and indexing scientific publications. A pre-processed dataset of scientific articles was used for training and testing the models. An architecture for cascade classification was developed, and the performance of models with ...

Added: February 11, 2026

Generation of Synthesizable Verilog Code From Natural Language Specifications

Yashchenko D. S., Romanov A., Ziazetdinov A.A. et al., IEEE Access 2026 Vol. 14 P. 4990–5001

This study presents a method for generating synthesizable Verilog code for digital integrated circuits directly from natural-language specifications. The approach combines large language models with parameter-efficient fine-tuning (specifically, Low-Rank Adaptation and Quantized Low-Rank Adaptation) together with a specialized corpus of specification-code pairs that covers common design patterns and varying task complexity. The pipeline includes automated ...

Added: February 11, 2026

Effect of Soil Pollution on the Decomposition Rate of Coniferous Litter

Трифонова Т. А., Gorbacheva A., Буйволов Ю. А. et al., Агрохимический вестник 2025 P. 61–66

To assess anthropogenic impact on the forest ecosystems of specially protected natural areas, plant litter decomposition in the Kuzminki-Lyublino Natural and Historical Park (Moscow) was compared with plant litter decomposition in the Prioksko-Terrasny Nature Reserve (Moscow Region). The biodegradation rate of coniferous litter was measured as biomass loss after a one-year exposure of soil envelopes in different forest types. Analysis of variance revealed a statistically significant ...

Added: February 10, 2026

Application of MIMO technology in wideband millimeter range wireless communications systems

Tiraspolsky S.A., Ermolayev V. T., Flaksman A. G. et al., Radioelectronics and Communications Systems 2011 Vol. 54 P. 219–226

A concept for using MIMO technology in millimeter-range wireless communications systems with orthogonal frequency division multiplexing is considered. The concept is based on dividing the transmitting and receiving multi-element antenna arrays into separate sub-arrays with analogue radiation pattern shaping and on using the two most powerful spatial sub-channels for information transmission. Sequence and structure of transmitted ...

Added: February 10, 2026

mmWave SVD-based beamformed MIMO communication systems

Sergey Tiraspolsky, Jeon B., Kim J. et al., Proceedings of the 7th IEEE conference on Consumer communications and networking (CCNC’2010) 2010 P. 834–838

This paper presents the concept of a data transmission protocol for millimeter wave (mmWave) wireless systems operating in a Non-Line-of-Sight environment. The concept is designed to provide effective and practical operation of a Multiple-Input Multiple-Output (MIMO) transmission mode that combines Singular Value Decomposition (SVD) of the channel matrix with non-adaptive beamforming. The proposed protocol reduces complexity of ...

Added: February 10, 2026

Selective interference cancellation using Kalman filtering

Tiraspolsky S., Rubtsov A., Pudeyev A. et al., Proceedings of the 2006 3rd International Symposium on Wireless Communication Systems, IEEE 2006 P. 21–24

In this paper we investigate a co-channel interference cancellation technique based on tracking only a limited number of the strongest interferers. Assuming synchronous base station operation with overlapping but different training signals (pilots), Kalman filtering may be used to estimate the interfering channels and then calculate the interference correlation matrix. This correlation ...

Added: February 10, 2026

Mobile WiMAX - Deployment Scenarios Performance Analysis

Tiraspolsky S., Maltsev A., Rubtsov A. et al., Proceedings of the 2006 3rd International Symposium on Wireless Communication Systems, IEEE 2006 P. 353–357

In this paper, a dynamic system-level simulation methodology for mobile WiMAX (IEEE Std 802.16e) is described. The system-level simulation scenarios (channel models, path loss and shadow fading, sectorization, frequency reuse planning, system loading, etc.) are introduced. Evaluated performance of the mobile WiMAX system, such as signal-to-interference-plus-noise ratio distributions, spectral efficiency and system outage ...

Added: February 10, 2026

Efficiency of the Grassmannian Beamforming Scheme in MIMO Communication Systems

Тираспольский С.А., Червяков А. В., Труды Научной конференции по радиофизике, ННГУ, 2004 2004 P. 169–171

Beamforming (BF) in MIMO (multiple-input multiple-output) systems, which simultaneously use several transceivers at both ends of the communication link, is a fairly simple way to increase throughput and raise the SNR at the receiving end. Most previously proposed methods required the transmitter to know the channel matrix or part of its SVD, which places a significant load on ...

Added: February 10, 2026

High-resolution capability of adaptive antenna arrays for communication systems

S.A. Tiraspolsky, Serebryakov G. V., Журнал радиоэлектроники 2002 No. 7

In this paper we compare different geometric configurations of adaptive antenna arrays for communications, with the aim of estimating the directions of arrival (DOA) of several external signals. The investigated antenna configurations have four elements and an array size of eleven wavelengths. The best high-resolution algorithm and the best array configuration are determined by numerical simulations. ...

Added: February 10, 2026

Application of Adaptive Antenna Arrays to Increase the Data Transmission Rate

С.А.Тираспольский, Ермолаев В. Т., Флаксман А. Г. et al., Труды Научной конференции по радиофизике, ННГУ, 2002 2002 P. 22–28

This paper considers the principle of information transmission and theoretically investigates the capacity of a MIMO system in a random radio propagation channel; various algorithms for allocating transmitter power across parallel orthogonal spatial subchannels are discussed. ...

Added: February 10, 2026

Multiple adaptive recursive array for multipath environment

S. Tiraspolsky, Sellone F., Serebryakov G., Proceedings of the International Conference on Electromagnetics in Advanced Applications (ICEAA 01) 2001 P. 691–696

In a wireless communication system, signals sent into the channel interact with the environment in a very complex way. Thereby transmitted signals may be subject to many forms of degradation among which there are causes of multipath propagation: • Reflections due to obstacles with the size greater than a wavelength; • Refractions due to the ...

Added: February 10, 2026

Efficiency of Linear Signal Processing in Communication Systems over a Multipath Ionospheric Channel in the Decameter Band

Тираспольский С.А., Флаксман А. Г., Ермолаев В. Т. et al., Известия высших учебных заведений. Радиоэлектроника 2016 No. 1 P. 8–14

Decameter-band communication systems operating over a multipath ionospheric spatial channel are considered. Physical-layer simulation is used to study the main characteristics of the system (bit and block error probability, throughput). It is shown that in a frequency-selective channel with a 3 kHz bandwidth, a linear equalization algorithm suppresses intersymbol interference effectively at all data rates except the highest. ...

Added: February 10, 2026

UVIP: Model-Free Approach to Evaluate Reinforcement Learning Algorithms

Belomestny D., Levin I., Naumov A. et al., Journal of Optimization Theory and Applications 2026 Vol. 208 Article 89

Policy evaluation is an important instrument for the comparison of different algorithms in Reinforcement Learning (RL). However, even a precise knowledge of the value function Vπ corresponding to a policy π does not provide reliable information on how far the policy π is from the optimal one. We present a novel model-free upper value iteration ...

Added: February 10, 2026

Ecosystem–Atmosphere Exchange of CO2 in Ombrotrophic and Mesotrophic Peatlands in the Taiga Zone of European Russia and West Siberia

Mamkin V., Avilov V., Dmitrichenko A. et al., Global Biogeochemical Cycles 2026 Vol. 40 No. 1 P. 1–21

Northern peatlands (>50°N) account for approximately 70% of the global peatland area and play a key role in the global carbon cycle. However, their role as long‐term carbon sinks is vulnerable to modern climate warming at high latitudes. Future climate predictions require data on how peatlands respond to observed changes in global environmental parameters, particularly ...

Added: February 8, 2026

Fundamentals of Computer Graphics

Korolev D., St. Petersburg: Лань, 2026.

The textbook consists of four sections covering physical foundations, analog-to-digital conversion of graphics, graphics and video compression, and graphics input and output devices; the book follows the structure and content of the theoretical part of the course. The main approach is to systematize school-level knowledge and build a coherent picture of working with graphics and video "from the inside". Various examples illustrate elegant engineering solutions in ...

Added: February 7, 2026

The Role of Women Scientists in the Development of Science and Education: Collected Papers of the Participants of the International Forum of Women Scientists Dedicated to the 105th Anniversary of BSU

Minsk: РИВШ, 2026.

The collection presents scientific articles by women scientists and lecturers who took part in the International Forum of Women Scientists, organized for the 105th anniversary of the Belarusian State University by the BSU primary organization of the public association "Belarusian Union of Women". It includes articles by representatives of Belarus, Russia, China, Kyrgyzstan, Azerbaijan, India, and Iraq, specialists in biology, design, journalism, cultural studies, medicine, management, pedagogy, psychology, sociology, physics, philology, philosophy, ...

Added: February 6, 2026

Multimodal graph, surface, and language-based model for protein protein interaction prediction

Arteaga Moreano B. D., Poptsova M., Scientific Reports 2026 No. 16 Article 4772

Accurate prediction of protein-protein interactions (PPIs) is fundamental to understanding biological processes and disease mechanisms. While deep learning offers a powerful alternative to costly experimental methods, existing approaches often overlook critical protein-surface information and rely on simplistic feature fusion techniques, thereby limiting performance. To address this, we introduce GSMFormer-PPI, a novel multimodal framework that integrates ...

Added: February 4, 2026