The Russian Drug Reaction Corpus and Neural Models for Drug Reactions and Effectiveness Detection in User Reviews

E. Tutubalina; Miftahutdinov Z.; Sakhovskiy A.; Malykh V.; S. I. Nikolenko

doi:10.1093/bioinformatics/btaa675

Bioinformatics. 2021. Vol. 37. No. 2. P. 243–249.

Tutubalina E., Alimova I. S., Miftahutdinov Z., Sakhovskiy A., Malykh V., Nikolenko S. I.

Drugs and diseases play a central role in many areas of biomedical research and healthcare. Aggregating knowledge about these entities across a broader range of domains and languages is critical for information extraction (IE) applications. To facilitate text mining methods for the analysis and comparison of patients' health conditions and adverse drug reactions reported on the Internet with traditional sources such as drug labels, we present a new corpus of Russian-language health reviews.

The Russian Drug Reaction Corpus (RuDReC) is a new partially annotated corpus of consumer reviews in Russian about pharmaceutical products for the detection of health-related named entities and the effectiveness of pharmaceutical products. The corpus consists of two parts, a raw one and a labeled one. The raw part includes 1.4 million health-related user-generated texts collected from various Internet sources, including social media. The labeled part contains 500 consumer reviews about drug therapy with drug- and disease-related information. Each sentence is labeled for the presence or absence of health-related issues. Sentences that contain such issues are additionally labeled at the expression level to identify fine-grained subtypes such as drug classes and drug forms, drug indications, and drug reactions. Further, we present a baseline model for named entity recognition (NER) and multilabel sentence classification tasks on this corpus. Our RuDR-BERT model achieved a macro F1 score of 74.85% on the NER task. For the sentence classification task, our model achieves a macro F1 score of 68.82%, a gain of 7.47% over a BERT model trained on Russian data.
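For readers unfamiliar with the metric reported above, macro F1 is the unweighted mean of the per-label F1 scores, so rare labels count as much as frequent ones. The following is a minimal sketch for the multilabel sentence classification setting. It is not the authors' evaluation script, and the label names and toy data are hypothetical and not taken from RuDReC.

```python
from typing import List, Set


def f1_for_label(gold: List[Set[str]], pred: List[Set[str]], label: str) -> float:
    """F1 for a single label, treating each sentence as one binary decision."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if label in g and label in p)
    fp = sum(1 for g, p in pairs if label not in g and label in p)
    fn = sum(1 for g, p in pairs if label in g and label not in p)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


# Hypothetical label set and toy annotations (one set of labels per sentence).
LABELS = ["ADR", "INDICATION", "EFFECTIVE"]
gold = [{"ADR"}, {"INDICATION", "EFFECTIVE"}, set(), {"ADR", "EFFECTIVE"}]
pred = [{"ADR"}, {"INDICATION"}, {"ADR"}, {"EFFECTIVE"}]

score = macro_f1(gold, pred, LABELS)
print(f"macro F1 = {score:.4f}")
```

On this toy data the per-label scores differ widely, which illustrates why a macro average is informative when some labels are much rarer than others.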

Research target: Medical Biotechnologies

Priority areas: engineering science

Language: English

Keywords: natural language processing, information extraction, deep learning

Publication based on the results of:

Development of Mathematical Models and Methods for Recommender Systems and Natural Language Processing (2020)

Unlocking Stress Coping Mechanisms: Implications for Salivary Antioxidant Defense and Trace Element Homeostasis

Alshanskaia E., Journal of Molecular Neuroscience 2026 Vol. 76 Article 19

This study investigates the relationship between stress coping ability, salivary antioxidant capacity (AOC), and trace element concentrations, focusing on zinc (Zn) and potassium (K). A cohort of 73 participants, divided into groups based on stress coping ability (SCA) (“adaptive”, “intermediate”, and “maladaptive”), underwent cognitive tasks while physiological and behavioral data were collected. Saliva samples were ...

Added: February 2, 2026

Method of Critical Set construction for Successive Cancellation List Decoder of Polar Codes Based on Deep Learning of Neural Networks

Kotov F. I., Timokhin I., Ivanov F., in: 2023 XVIII International Symposium Problems of Redundancy in Information and Control Systems (REDUNDANCY). IEEE, 2023.

The Successive Cancellation List (SCL) algorithm is a widely used decoding technique in communication systems. However, constructing the critical set for SCL decoding is a challenging task, as it requires a large number of computations and can lead to significant decoding delays. In this paper, a new approach to critical set construction for SCL decoding ...

Added: January 26, 2026

Optimization of Gas Permeability in PDMS Microfluidic Chips for Organ-on-Chip Modeling

I. A. Khaustov, Yu. A. Safronova, O. E. Chebotareva et al., Applied Biochemistry and Microbiology 2025 Vol. 61 No. 9 P. 1753–1759

Polydimethylsiloxane (PDMS) remains one of the most popular materials for microfluidic chips, but its high gas permeability can lead to the formation of gas bubbles in microchannels, which hampers the reproducibility of experiments in organ-on-a-chip systems. The dependence of the gas permeability of PDMS on the ratio of base to hardener (2.5 : 1, 5 : 1, and ...

Added: December 22, 2025

Prediction of protein-protein interactions using point transformer and spherical Convex Hull graphs

David Arteaga, Poptsova M., Computational and Structural Biotechnology Journal 2026 Vol. 31 P. 82–93

Accurate predictions and large-scale identification of protein-protein interactions (PPIs) are crucial for understanding their inherent biological mechanisms and protein functions in virtually all biological processes. Nowadays, graph-based deep learning models have made significant contributions in modeling proteins with physicochemical and geometric features. However, most of these models rely on conventional graph construction methods, such as ...

Added: December 22, 2025

Validation of B.Well PRO-33 oscillometric blood pressure monitor for professional office use in the general population in accordance with Amendment 2 of the Standard 81060-2:2018 by the International Organization for Standardization (ISO 81060-2:2018/AMD 2:2024)

Posnenkova O., Simonyan M., Akimova N., Blood Pressure Monitoring 2025 No. 11

We aimed to evaluate the accuracy of B.Well PRO-33 (B.Well, Widnau, Switzerland) oscillometric device for professional office measurement of blood pressure (BP) on the upper arm in the general population in accordance with the Amendment 2.2024-01 of the Standard 81060-2:2018 by the International Organization for Standardization (ISO 81060-2:2018/AMD 2:2024). Study participants were recruited according to ...

Added: December 11, 2025

Code of Ethics for Artificial Intelligence in Medicine and Healthcare

Abramova A. V., Belousova E. N., Vatyukov S. E. et al., Health Care Standardization Problems 2025 No. 5-6 P. 3–14

The improvement of artificial intelligence (AI) technologies and their rapid integration into the socially and economically significant medical industry create broad prospects for ensuring accessibility and quality of medical care, while at the same time creating new challenges related to the safety and ethical risks of using innovative solutions. This creates the need to develop ...

Added: December 7, 2025

Comparative Analysis of Lipid Metabolism in Trophoblast Subpopulations in Preeclampsia and In Vitro Hypoxia Model

Ivan Antipenko, Evgeny Knyazev, Timur Kulagin et al., Frontiers in Molecular Biosciences 2025 Vol. 12 Article 1731126

Preeclampsia is a leading cause of maternal and perinatal morbidity associated with systemic lipid metabolism disturbances, yet the underlying molecular mechanisms remain incompletely understood. In this study, we integrated single-cell RNA-seq data from preeclamptic placentas with an in vitro hypoxia model to analyze gene expression changes across distinct trophoblast subpopulations. While all trophoblast lineages exhibited ...

Added: December 2, 2025

Determining the boundary of dynamical chaos in the generalized Chirikov map via machine learning

Chernyshov D. P., Satanin A., Shchur L. / Series arXiv "math". 2025.

We investigate the boundary separating regular and chaotic dynamics in the generalized Chirikov map, an extension of the standard map with phase-shifted secondary kicks. Lyapunov maps were computed across the parameter space (K,K(α, τ)) and used to train a convolutional neural network (ResNet18) for binary classification of dynamical regimes. The model reproduces the known critical ...

Added: November 21, 2025

Placenta-on-a-Chip Microfluidic Model: Optimization of Perfusion Conditions

S. Yu. Paul, Yu. A. Safronova, O. E. Chebotareva et al., Applied Biochemistry and Microbiology 2025 Vol. 61 No. 7 P. 1369–1378

The design and development of a placenta-on-a-chip model is of great importance for various fields of cell biology, especially in studies of the molecular mechanisms of disease pathogenesis and the action of potential drugs. Currently, milling technique is most commonly used to fabricate microfluidic devices from thermoplastics. However, this technology leads to the formation of ...

Added: November 19, 2025

Objectification of Illness: The Phenomenon of Reification in Digital Psychiatry

Ugleva A. V., Voprosy Filosofii 2025 No. 11 P. 112–123

The article focuses on the phenomenon of reification in digital psychiatry. The author highlights that AI technologies exacerbate the problem of translating complex culturally-conditioned psychiatric constructs into formal mathematical structures, which creates an illusion of objectivity and impedes the development of personalized medical care. The main objective of the article is to minimize negative consequences ...

Added: November 6, 2025

ELOVL5 Regulates Ferroptosis in Breast Cancer Cells

K. V. Klycheva, A. V. Razumovskaya, A. D. Shatsillo et al., Doklady Biochemistry and Biophysics 2025

Today, breast cancer (BC) occupies a leading position in prevalence and mortality from oncological diseases among the female population worldwide. Ferroptosis is a special type of cell death associated with peroxidation of intracellular lipids. It is a promising option for the therapy of BC resistant to traditional methods of treatment. The ELOVL5 gene, ...

Added: October 29, 2025

Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V

Cham: Springer, 2025.

This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025. The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...

Added: September 29, 2025

A novel method for multiplex protein biomarker analysis of human serum using quantitative MALDI mass spectrometry

Taraskin A., Konstantin K. Semenov, Lozhkov A. et al., Journal of Pharmaceutical and Biomedical Analysis 2022 Vol. 210 Article 114575

In this work, we have extended our previously proposed approach for determining protein concentrations in human serum (using MALDI-TOF mass spectrometry) to include simultaneous analysis of several proteins associated with acute inflammation (alpha-2-macroglobulin, fetuin-A, serum amyloid A1). This technique can be used to diagnose systemic inflammation and provides results in 4–5 h. The developed approach ...

Added: September 23, 2025

Uncertainty estimation for quantitative agarose gel electrophoresis of nucleic acids

Konstantin Semenov, Taraskin A., Yurchenko A. et al., Sensors 2023 Vol. 23 No. 4 Article 1999

This paper considers the evaluation of the uncertainty of quantitative gel electrophoresis. To date, the uncertainty estimates presented in the literature are based on multiple measurements performed to assess intra- and interlaboratory reproducibility using standard samples. This paper shows how to estimate the uncertainty in cases where we cannot study scattering components of the ...

Added: September 23, 2025

Transcriptome Analysis of Bone Marrow Plasma Cells in Multiple Myeloma Patients before Treatment

Shaitan A., Biochemistry. Biokhimiia 2025 Vol. 19 No. 1 P. 98–108

Multiple myeloma (MM) is a malignant lymphoproliferative disorder associated with accumulation of terminally differentiated B lymphocytes (plasma cells) in the bone marrow, monoclonal expression of pathologic immunoglobulin, anemia, renal damage, hypercalcemia, and bone lesions. Despite considerable attention to the study of MM pathogenesis and the development of new drugs, this disease remains incurable. Omics technologies ...

Added: September 18, 2025

Rewriting the Rules: LLMs Vs. Traditional ML in University Admissions

Chepikov I., Karpov I., in: 26th International Conference, AIED 2025, Palermo, Italy, July 22–26, 2025, Proceedings, Part I. Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium, Blue Sky, and WideAIED. Springer, 2025. P. 352–358.

Modern LLMs such as BERT, ChatGPT, and DeepSeek have shown great potential in solving various tasks, including text classification, text generation, and document analysis and summarization. In this paper, we show that these models are close to classical ML approaches based on decision trees not only in text processing, but also in processing classical tabular data ...

Added: September 4, 2025

Software for the Analysis of Reach-to-Grasp and Transport Hand Movements: Potential Applications in Cognitive and Neurophysiological Research

Viazmin A., Chapanova M. R., Morozov M. S. et al., Neuromuscular Diseases 2025 Vol. 15 No. 2 P. 28–36

Objective. To present Kinematic 4, a software tool designed for automated analysis of reach-to-grasp and object transport movements using data obtained from motion capture systems. Materials and Methods. The software was developed in Python and implements algorithms for automatic detection of key temporal events in motor actions. Initially validated in MATLAB, the algorithmic framework was adapted ...

Added: September 2, 2025

Developing an Approach for Automated Data Collection and Mining Using Web Scraping Techniques and Large Language Models: A Case Study on Extracting Technology Readiness Level Assessments

F. M. Grozovskiy, I. V. Loginova, Automatic Documentation and Mathematical Linguistics 2025 Vol. 59 No. 4 P. 269–278

The paper proposes an approach to the automated extraction and structuring of information from text, combining web scraping for data collection from online sources with a large language model for subsequent data mining. As a case study, texts from news publications on technology readiness levels from the CNews website were chosen to test the developed methodology in a ...

Added: August 25, 2025

Influence of Microplastics on Manifestations of Experimental Chronic Colitis

Zolotova N., Silina M., Dzhalilova D. et al., - 2025 Vol. 13 No. 8 Article 701

Environmental pollution with microplastics (MPs) can have a negative impact on human health. Certain findings point to the relationship between MP and the development of inflammatory bowel diseases (IBD). We investigated the effect of MP consumption on the severity of chronic colitis in male C57BL/6 mice. The MP effect was modeled by drinking water consumption ...

Added: August 25, 2025

Heterologous production of antimicrobial peptides in yeast allows for massive assessment of the activity of DNA-encoded antimicrobials in situ

Pipiya S. O., Ivanova A. O., Mokrushina Y. A. et al., Acta Naturae 2025 Vol. 17 No. 1 P. 71–77

Antibiotic resistance threatens global healthcare. In clinical practice, conventional antibiotics are becoming gradually less effective. Moreover, the introduction of new antimicrobial agents into clinical practice leads to the emergence of resistant pathogenic strains within just a few years. Hence, the development of platforms for massive creation and screening of new antimicrobial agents is of particular ...

Added: August 10, 2025

Deep learning deciphers the related role of master regulators and G-quadruplexes in tissue specification

Artem B., Andreasyan A., Konovalov D. et al., Scientific Reports 2025 Vol. 15 Article 23119

G-quadruplexes (GQs) are non-canonical DNA structures encoded by G-flipons with potential roles in gene regulation and chromatin structure. Here, we explore the role of G-flipons in tissue specification. We present a deep learning-based framework for the genome-wide G-flipon predictions across 14 human tissue types. The model was trained using high-confidence experimental maps of GQ-forming sequences ...

Added: August 8, 2025

FUNCTIONAL ANALYSIS OF BIPARTITE NRF2 ACTIVATORS THAT OVERCOME FEEDBACK REGULATION FOR AGE-RELATED CHRONIC DISEASES

Hushpulian D., Kaidery N. A., Soni P. et al., Redox Biology 2025 Vol. 86 Article 103794

Activating Nrf2 with small molecules is a promising strategy for countering aging, oxidative stress, inflammation, and various disorders, including neurodegeneration. The primary regulator of Nrf2 protein stability is Keap1, a redox sensor protein and an adapter in the Cullin III ubiquitin ligase complex, which labels Nrf2 for proteasomal degradation. The canonical Nrf2 activators either chemically ...

Added: August 1, 2025

AI in drug development: advances in response, combination therapy, repositioning, and molecular design

Shaitan A., Science China Information Sciences 2025 Vol. 68 No. 7 Article 170102

Artificial intelligence (AI) is revolutionizing the field of drug development, particularly in addressing key challenges such as drug response prediction, drug combination design, drug repositioning, and drug molecule generation. Traditional drug discovery is hindered by long timelines, high costs, and low success rates, necessitating innovative technologies to accelerate the process. AI technologies, such as deep ...

Added: June 25, 2025

An Approach to Finding a Robust Deep Learning Model

Boldyrev A., Ratnikov F., Shevelev A., IEEE Access 2025 Vol. 13 P. 102390–102406

The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of a large number of models. This growing demand highlights the importance of training models without human supervision, while ensuring that their predictions are reliable. In response to this need, we propose a novel approach for determining model robustness. This approach, supplemented with a ...

Added: June 15, 2025