Creating and Using Synthetic Data for Neural Network Training, Using the Creation of a Neural Network Classifier of Online Social Network User Roles as an Example

P. 412–421.
Rabchevskiy A., Yasnitsky L.
Language: English
Keywords: social network, dataset creation, dataset synthesis, synthetic data

In book

Digital Science: DSIC 2021
Vol. 381. Switzerland: Birkhauser/Springer, 2022.
Similar publications
Model for Assessing the Need to Involve Users of Social Networks in a Healthy Lifestyle and Giving up Bad Habits According to the Data of a Social Network
Varnavsky A., in: 26th International Conference, DAMDID/RCDL 2024, Nizhny Novgorod, Russia, October 23–25, 2024, Revised Selected Papers. Data Analytics and Management in Data Intensive Domains (CCIS, volume 2641). Springer, 2026. P. 239–252.
An urgent task is to preserve and maintain the health of the country’s population, including by promoting a healthy lifestyle. Since social networks are very popular, especially among young people, they can be used to promote a healthy lifestyle. Despite the existing research on the influence of social networks on user ...
Added: May 7, 2026
Assessing the Big Data Value: Approaches and Methods
Maltseva S. V., in: Informatics and Applied Mathematics: Proceedings of the X International Scientific and Practical Conference (October 8–11, 2025). Vol. 1: Collected Papers, Part 1. Almaty: Institute of Information and Computational Technologies of the Science Committee of the Ministry of Science and Higher Education of the Republic of Kazakhstan, 2025.
Modern technological capabilities for obtaining data make data an important resource. Data analytics, the development of products and services that actively use big data, and the implementation of the data-driven organization concept call for further development of methods for assessing the value, usefulness and cost of big data. Existing and promising methods, including the influence of ...
Added: March 3, 2026
Modeling Information Network Interaction in Cyber-Social Systems
Maltseva S. V., Голубцов П. В., Барахнин В. Б., Вычислительные технологии 2026 Vol. 31 No. 1 P. 5–22
The paper considers macro-level monitoring of the manufacturing system under the Industry 4.0 and 5.0 concepts, based on the study of information flows in manufacturing network structures. Numerical models of three types of network interaction that take into account the influence of the number of objects, external influences, ...
Added: February 26, 2026
Semi-automatic annotation of brain vessels in magnetic resonance angiography images
Bernadotte A, Elfimov N., Menshikov I., Scientific data 2025 Vol. 13 No. 41
Accurate segmentation of brain vessels in magnetic resonance angiography (MRA) is essential for surgical procedures. Neural networks are powerful tools for medical image segmentation, but their development requires well-annotated datasets. However, publicly available MRA datasets with detailed vessel annotations are scarce. We present a dataset of 100 manually annotated brain MRA images from the IXI ...
Added: February 25, 2026
A Foundation Model for Time Series and How (Not) to Train It on Synthetic Data
Temirkhanov A., Костромина А. М., Цымбой О. А. et al., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2025 Vol. 527 No. S P. 485–494
Industry is rich in cases where forecasts are required for large numbers of time series at once. However, we may be unable to afford training a separate model for each of them. This issue in time series modeling has not received due attention. The remedy for ...
Added: February 24, 2026
City Public Pages on the VKontakte Social Network: Specifics of Attracting an Audience and Features of Information Presentation
Дементьева К. В., Вестник Томского государственного университета. Филология 2021 No. 73 P. 287–310
The article analyzes the public pages of provincial cities – centers of regions, which are information resources based on user-generated content, and reveals their role and impact on public opinion in modern society. The author’s definition of the city’s public page is given; the existing typologies are considered. Also, at this stage, the content of ...
Added: October 31, 2025
Enhancing bankruptcy prediction efficiency using synthetic data
Elizaveta V. Lashkevich, Business Informatics 2025 Vol. 19 No. 3 P. 22–47
Predicting a firm's financial insolvency is crucial for investors, creditors, and regulators. However, access to high-quality, balanced data for model training is often limited due to privacy concerns, information scarcity, or financial reporting characteristics. This paper explores the potential of synthetic data generation techniques to increase minority class instances in unbalanced datasets and thereby potentially improve ...
Added: September 15, 2025
AGDES: a Python package and an approach to generating synthetic data for differential equation solving with LLMs
Vladimir Zakharov, Anton Surkov, Sergei Koltcov, Procedia Computer Science 2025 Vol. 258 P. 1169–1178
The rapid development of large language models (LLMs), including their successful application to solving mathematical problems requiring complex reasoning, presents a potential avenue for using LLMs in solving differential equations. While these equations are currently being solved successfully both numerically and via the symbolic approach, it is possible that fine-tuned LLMs, if they treat solving ...
Added: August 21, 2025
Agent-Based Model of Protest Campaign with Dynamic Network
Petrov A., Sergey Zheglov, Akhremenko A. S., in: 2024 17th International Conference on Management of Large-Scale System Development (MLSD). IEEE, 2024. Ch. 1. P. 1–4.
Agent-based models on networks most often assume static networks. However, in some contexts political science provides arguments that the dynamical character of the network should be taken into account. Specifically, during a protest campaign, new social ties between the participants may occur. Here we present an agent-based model of protest campaign and some numerical experiments with it. It is shown that ...
Added: May 1, 2025
Sim4Rec: Flexible and Extensible Simulator for Recommender Systems for Large-Scale Data
Anna Volodkevich, Ivanova V., Vasilev A. et al., in: Advances in Information Retrieval: 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6–10, 2025, Proceedings, Part IV. Springer, 2025. P. 425–430.
Simulators for recommender systems are widely used for recommender systems performance evaluation and feedback loop effects analysis. Existing simulators often propose inflexible pipelines, are focused on narrow research tasks, or are not adapted to work with industrial large data volumes. To address these challenges, we developed the Sim4Rec simulation framework. The Sim4Rec models key aspects ...
Added: April 10, 2025
User response modeling in recommender systems: a survey
M. Shirokikh, Shenbin I., Alekseev A. et al., Journal of Mathematical Sciences 2024 Vol. 285 No. 2 P. 255–284
Over the last several decades, recommender systems have become an integral part of both our daily lives and the research frontier in machine learning. In this survey, we explore various approaches to developing simulators for recommendation systems, especially for modeling the user response function. We consider simple probabilistic models, approaches based on generative adversarial networks, ...
Added: November 24, 2024
MedSyn: LLM-based synthetic medical text generation framework
Kumichev G., Blinov P., Kuzkina Y. et al., in: Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part X. LNCS, volume 14950. Cham: Springer, 2024. P. 215–230.
Generating synthetic text addresses the challenge of data availability in privacy-sensitive domains such as healthcare. This study explores the applicability of synthetic data in real-world medical settings. We introduce MedSyn, a novel medical text generation framework that integrates large language models with a Medical Knowledge Graph (MKG). We use MKG to sample prior medical information for the prompt and generate synthetic ...
Added: November 22, 2024
Watermarking for social networks images with improved robustness through polar codes
Evsyutin O., Ivanov F., Dzhanashia K., IEEE Access 2024 Vol. 12 P. 118154–118168
Protecting ownership of digital content is challenging in today’s online world, especially when sharing content through social networks and instant messengers. One possible solution is the use of watermarking; however, if the watermarking method is not robust enough, the watermark can get damaged or erased during transmission. This study introduces a template-based watermarking method with ...
Added: September 1, 2024
An Application for Searching, Analyzing, and Forecasting Data in Social Networks
Slastnikov S., Zhukova L., Semichasnov I., Информационные технологии и вычислительные системы 2024 No. 1 P. 97–108
In this article, we present a web service designed for searching, extracting, and analyzing data from social networks and messengers, demonstrating its application for studying communities within the "VKontakte" social network. The web service enables the identification of typical user profiles within communities, the assessment of emotional sentiment in posts and comments, as well as ...
Added: August 12, 2024
The Role of Synthetic Data in Improving Neural Network Algorithms
Rabchevskiy A., Leonid N. Yasnitsky, in: 2022 4th International Conference on Control Systems, Mathematical Modeling, Automation and Energy Efficiency (SUMMA). IEEE, 2022. P. 316–312.
This review article describes synthetic data, its applications, and examples of improving neural network algorithms with synthetic data. Using these examples, we show the important role of synthetic data in the improvement of neural network algorithms and the development of artificial intelligence ...
Added: February 15, 2024
Synthesis of Datasets for Neural Networks Based on Expert Knowledge
Rabchevskiy A., Ashikhmin E., Yasnitsky L., in: Cyber-Physical Systems and Control II. Springer, 2023. P. 535–544.
The problem of creating datasets for training and testing neural networks is described in the example of the task of social network management. A method of expert dataset synthesis based on experts’ knowledge of the subject area is proposed. The essence of the method lies in the fact that sets are generated randomly within the ...
Added: November 20, 2023
Do Personal Relationships Boost Academic Performance More for Women than for Men?
Dokuka S., Mikhaylova O., Journal of Social and Personal Relationships 2024 Vol. 41 No. 4 P. 720–729
Social integration is known to be positively related to academic performance. It is also well-known to play a different role for (self-identified) men and women. In this paper, we examine the differences seen in the correlations between academic performance and social integration for men and women. Gender was determined on the basis of self-identification. Utilizing ...
Added: September 18, 2023
Social Health and Change in Cognitive Capability among Older Adults: Findings from Four European Longitudinal Studies
Maddock J., Gallo F., Wolters F. et al., Gerontology 2023 Vol. 69 No. 11 P. 1330–1346
Introduction: In this study we examine whether social health markers measured at baseline are associated with differences in cognitive capability and in the rate of cognitive decline over an 11-to-18-year period among older adults and compare results across studies. Methods: We applied an integrated data analysis approach to 16,858 participants (mean age 65 years; 56% ...
Added: September 15, 2023
COVID-19 in social networks: unravelling its impact on youth risk perception, motivations and protective behaviours during the initial stages of the pandemic
Marta Anson, Ksenia Eritsyan, International Journal of Adolescence and Youth 2023 Vol. 28 No. 1 Article 2245012
The study explores the roles of youth prosocial, self-interested and controlled motivations to comply with recommended protective behaviour during the initial stages of the COVID-19 pandemic. We test the interrelations of awareness of COVID-19 cases in social network, risk perception, motivation and behaviour, via structural equation modelling on self-reported data from 1,265 undergraduate university students. ...
Added: August 24, 2023
An Ontology-Driven Approach to the Analytical Platform Development for Data-Intensive Domains
Viktor S. Zayakin, Lyadova L. N., Viacheslav V. Lanin et al., in: Knowledge Discovery, Knowledge Engineering and Knowledge Management: 13th International Joint Conference, IC3K 2021, Virtual Event, October 25–27, 2021, Revised Selected Papers. Vol. 1718: IC3K: International Joint Conference on Knowledge Discovery, Knowledge Engineering, and Knowledge Management. Springer, 2023. Ch. 8. P. 129–149.
Added: July 8, 2023
CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media
Hardalov M., Chernyavskiy A., Koychev I. et al., in: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 2022. P. 266–285.
While there has been substantial progress in developing systems to automate fact-checking, they still lack credibility in the eyes of the users. Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return an article that explains their decision. ...
Added: May 21, 2023
Study of Strategies for Disseminating Information in Social Networks Using Simulation Tools
Usanin A., Zimin I., Elena Zamyatina, in: Analysis of Images, Social Networks and Texts: 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020, Revised Selected Papers. Vol. 12602. Springer, 2021. P. 303–315.
The paper presents simulation tools for investigating not only the structural characteristics of social networks, in order to study information dissemination strategies, but also the dynamic characteristics of this process. A feature of this software system is the ability to work not only with virtual social networks but also with data from real networks. To ...
Added: October 30, 2021
Migration Behavior of Russian University Students Based on Digital Footprint Data
Gabdrakhmanov N., Орлова В. В., Александрова Ю. К., Вестник Томского государственного университета 2021 Vol. 467 P. 106–114
The study analyzes the educational and labor migration of 400,000 graduates of Russian higher education institutions from eight universities, based on digital footprint data. Several types of sequential migration behavior are identified, from school to university and onward to the labor market. Differences in students' migration behavior are illustrated using several universities as examples. ...
Added: October 12, 2021
An Ontology-Based Approach to Social Networks Mining
Viacheslav Lanin, Lyudmila Lyadova, Elena Zamyatina et al., in: Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. Vol. 2: KEOD. Lisbon: SciTePress, 2021. P. 234–239.
Added: October 2, 2021