Topic Modeling for Short Texts: A Comparative Analysis

Социология: методология, методы, математическое моделирование (Sociology: Methodology, Methods, Mathematical Modeling). 2023. No. 56. P. 69–112.
Vashchenko V.

The steady growth of social media as a means of communication raises methodological questions about processing short texts, which carry less semantic context than the large corpora widely used to train and test machine learning models for textual data. Topic modeling, an unsupervised machine learning technique that aggregates texts into topic clusters, has many academic and practical applications in which the true grouping of texts is unknown. However, topic modeling algorithms need sufficient semantic context to build a high-quality numerical representation of a unit of text, and a short document may not provide it, which can limit their performance. This paper discusses three approaches to topic modeling: classical LDA enriched with pre-trained word embeddings, topic modeling based on the BERT transformer model, and a network-based approach using stochastic blockmodels. We compare these algorithms on a set of Russian-language TikTok comments and formally evaluate them by speed and by the coherence of the resulting topics.

Research target: Sociology (including Demography and Anthropology); Media and Communications; Computer Science
Language: Russian
DOI
Keywords: textual data analysis; topic modeling; applied network analysis
Publication based on the results of:
Development of network analysis in Russia: adaptation of theoretical and methodological approaches and practical application (2024)