Prediction of News Popularity via Keywords Extraction and Trends Tracking

Ch. 4. P. 37–51.
Alexander Pugachev, Voronov A., Makarov I.

In recent years, news agencies have become more influential among various social groups. At the same time, the media industry has begun to monetize articles distributed online through contextual advertising. However, the efficiency of online marketing depends heavily on the popularity of news articles. In our work, we present an alternative and effective way to forecast article popularity with a two-step approach: article keyword extraction and keyword-based article popularity prediction. We show the benefits of this technique and compare it with widely used methods, such as text embeddings and BERT-based methods. Moreover, the work provides an architecture of the model for dynamic keyword tracking, trained on the newest dataset of Russian news articles, with more than 280k articles and 22k keywords, for popularity forecasting purposes.
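
The two-step idea in the abstract (extract keywords, then predict popularity from those keywords) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: TF-IDF ranking stands in for the keyword extractor, a per-keyword mean of past view counts stands in for the predictor, and the function names `extract_keywords`, `fit_keyword_popularity`, and `predict_popularity` are hypothetical. It is not the model described in the chapter.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase alphabetic tokens longer than 3 characters (works for Cyrillic too).
    return [w for w in re.findall(r"[^\W\d_]+", text.lower()) if len(w) > 3]


def extract_keywords(docs, top_k=3):
    """Step 1 (assumed stand-in): top_k tokens per document by TF-IDF score."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    n_docs = len(docs)
    result = []
    for toks in tokenized:
        term_freq = Counter(toks)
        scores = {
            w: count * math.log((1 + n_docs) / (1 + doc_freq[w]))
            for w, count in term_freq.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:top_k])
    return result


def fit_keyword_popularity(keyword_lists, views):
    """Step 2a (assumed stand-in): mean view count of articles carrying each keyword."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for keywords, view_count in zip(keyword_lists, views):
        for kw in keywords:
            totals[kw] += view_count
            counts[kw] += 1
    return {kw: totals[kw] / counts[kw] for kw in totals}


def predict_popularity(keywords, keyword_mean, default):
    """Step 2b: average the learned means of the known keywords of a new article."""
    known = [keyword_mean[kw] for kw in keywords if kw in keyword_mean]
    return sum(known) / len(known) if known else default
```

In this sketch a new article whose keywords were never seen in training falls back to `default`; a tracker that updates keyword statistics over time, as the abstract's dynamic keyword tracking suggests, would replace the static means.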

Language: English
Keywords: natural language processing; keyword extraction; online news popularity forecasting; popularity prediction; BERT; text embedding; news popularity prediction

In book

Recent Trends in Analysis of Images, Social Networks and Texts. 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020 Revised Supplementary Proceedings
Vol. 12602. Springer, 2021.
Similar publications
Development of a Language Model for Automated Classification of English-Language Scientific Articles by SRSTI Codes
V. V. Zunin, A. I. Afonin, V. I. Anoshin et al., Automatic Documentation and Mathematical Linguistics 2025 Vol. 59 No. 5 P. 287–293
The development of an artificial intelligence-based language model for classifying English-language scientific articles by SRSTI codes is described. This improves the processes of reviewing and indexing scientific publications. A pre-processed dataset of scientific articles was used for training and testing the models. An architecture for cascade classification was developed, and the performance of models with ...
Added: February 11, 2026
The Impact of Alternative Data on Default Probability: Analyzing the Italian E-commerce Sector with NLP and Network Structures
Bernhardt B. D., Marciano C., Guarracino M. R., Operations Research Forum 2025 Vol. 6 Article 47
E-commerce is a key sector in the Italian economy, with online companies becoming some of the largest and most profitable businesses. However, this growth comes with increased risk exposure. This study aims to investigate the relationship between alternative data (contextual factors, Text-Driven Data Enrichment) and the probability of default for Italian e-commerce companies. To date, ...
Added: September 6, 2025
Shrink the Longest: Improving Latent Space Isotropy with Simplicial Geometry
Kudrjashov S., Karpik O., Klyshinskiy E., in: Analysis of Images, Social Networks and Texts, 12th International Conference, AIST 2024, Bishkek, Kyrgyzstan, October 17–19, 2024, Revised Selected Papers. Vol. 15419. Springer, 2024. P. 120–130.
Added: May 29, 2025
An Artificial Intelligence-Based Ethics Index for Russian Banks
Storchevoy M., Parshakov P., Paklina S. et al., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2024 Vol. 520 No. 6 P. 70–81
Measuring a company's ethics is an important element in the mechanism of regulating the behavior of market participants, as it allows consumers and regulators to make better decisions, which has a disciplining effect on companies. We tested various methods of machine analysis of consumer feedback from Russian banks and developed an Ethics Index that allows ...
Added: October 31, 2024
The More Polypersonal the Better - A Short Look on Space Geometry of Fine-Tuned Layers
Sergei Kudriashov, Veronika Zykova, Stepanova A. et al., in: Advances in Neural Computation, Machine Learning, and Cognitive Research VIII, Selected Papers from the XXVI International Conference on Neuroinformatics, October 21-25, 2024, Moscow, Russia. Vol. VIII. Cham: Springer, 2024. P. 13–22.
The interpretation of deep learning models is a rapidly growing field, with particular interest in language models. There are various approaches to this task, including training simpler models to replicate neural network predictions and analyzing the latent space of the model. The latter method allows us to not only identify patterns in the model’s decision-making process, but also understand ...
Added: October 24, 2024
Regional inflation analysis using social network data
Shcherbakov, V., Karpov I., Economy of Regions 2024 Vol. 20 No. 3 P. 930–946
Inflation is one of the most important macroeconomic indicators and has a great impact on the population of any country or region. Inflation is influenced by a range of factors, one of which is inflation expectations. Many central banks around the world take this factor into consideration while implementing monetary policy within the inflation targeting ...
Added: December 7, 2023
Grammar in Language Models: BERT Study
Chistyakova K., Kazakova Tatiana, / NRU HSE. Series WP BRP "Linguistics". 2023. No. 115.
The problem of language models’ interpretation is extensively inspected, but no universal answers have been found. Our study offers to combine widely accepted probing methods with a novel approach to a neural network under investigation. We propose to break grammatical forms on the pre-training step in order to get two "sibling" models, as it casts ...
Added: November 29, 2023
Classification of Short Scientific Texts
I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176–183
This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of linguistic model BERT ...
Added: November 4, 2023
Identifying and Visualizing Trends in Science, Technology, and Innovation Using SciBERT
Lobanova P., Bakhtin P., Sergienko Y., IEEE Transactions on Engineering Management 2024 No. 71 P. 11898–11906
Identification of science, technology, and innovation trends is a critical topic both for the scientific community and for companies that develop technologies, work on science and technology policy, or invest in high tech. In this research, the authors demonstrate a novel approach implemented in the iFORA system (developed by National Research University Higher School of Economics) using ...
Added: September 8, 2023
How to detect propaganda from social media? Exploitation of semantic and fine-tuned language models
Malik M. S., Imran T., Mona Mamdouh J., PeerJ Computer Science 2023 Vol. 9 Article e1248
Online propaganda is a mechanism to influence the opinions of social media users. It is a growing menace to public health, democratic institutions, and public society. The present study proposes a propaganda detection framework as a binary classification model based on a news repository. Several feature models are explored to develop a robust model such ...
Added: September 4, 2023
Automated defect identification for cell phones using language context, linguistic and smoke-word models
Muhammad Z. Y., Malik M. S., Ignatov D. I., Expert Systems with Applications 2023 Vol. 227 Article 120236
Product defects are a widespread concern for manufacturers when conducting quality and customer relationship management. Prior approaches addressed many electronic products; however, cell phones are still unexplored. Moreover, prior work mainly focused on the lexicon, probabilistic graphic, failure mode, and effect analysis models, but the utilization of word embeddings and language models has not been explored. State-of-the-art contextual word embeddings and language models generate automated features and ...
Added: June 13, 2023
Computational Experiments on Detecting Meaning shift in Jokes
Eugeniia Zakovorotnaia, in: 2022 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). Ekaterinburg: IEEE, 2022. P. 840–843.
The paper describes an experimental approach to detect the meaning shift, one of the most fundamental characteristics of humor, which is studied by many scientists in different interdisciplinary theoretical methodologies. We measured cosine similarity between setups and punchlines and explained these results through the set of objective criteria such as cosine results limitations, punchline length, ...
Added: May 10, 2023
Using BERT to Classify Short Scientific Texts in Russian
Кусакин И. К., Цурупа А. М., Алмакаев А. В. et al., in: NTI-2022. Scientific Information in the Modern World: Global Challenges and National Priorities: Proceedings of the 10th Scientific Conference with International Participation Dedicated to the 70th Anniversary of VINITI RAS, Moscow, October 25–26, 2022. Moscow: VINITI RAS, 2022. P. 103–109.
This work is devoted to the study of approaches for training BERT-based classifiers of scientific articles to implement the application with the adoption of the best models for use in the infrastructure of the VINITI RAS. For this purpose, the BERT linguistic model was trained on a specialized corpus of scientific texts for subsequent use ...
Added: January 31, 2023
A Study of Machine Learning Methods for Classifying Scientific Texts in Russian
Кусакин И. К., Федорец О. В., Romanov A., Научно-техническая информация. Серия 2: Информационные процессы и системы 2022 Vol. 12 P. 6–9
This paper discusses modern approaches to natural language processing and the application of artificial intelligence technologies to the task of classifying scientific texts in Russian. The report contains an analysis of implementations of text vectorization methods and a description of experiments with training various classifier models, from classical machine learning algorithms to neural network transformer architectures. ...
Added: January 31, 2023
Opinion Mining for Modeling User Experience of Online Education: Sentiment Analysis and Keywords Extraction of Student Reviews
Moskvina A., Kirina M., Anastasia Gavrilyuk, in: 2022 32nd Conference of Open Innovations Association (FRUCT). IEEE, 2022. P. 187–195.
The paper discusses the possibilities of applying modern natural language processing technologies for opinion mining to investigate and improve the user experience of online course students. We analyzed 27,000 student reviews of projects within the Python programming language course. First, we applied keyword extraction algorithms as a way of semantic compression to receive a generalized ...
Added: December 9, 2022
SocialBERT – Transformers for Online Social Network Language Modelling
Ilia Karpov, Nick Kartashev, in: Analysis of Images, Social Networks and Texts. 10th International Conference, AIST 2021, Tbilisi, Georgia, December 16–18, 2021, Revised Selected Papers. Cham: Springer, 2022. P. 1–10.
The ubiquity of the contemporary language understanding tasks gives relevance to the development of generalized, yet highly efficient models that utilize all knowledge, provided by the data source. In this work, we present SocialBERT - the first model that uses knowledge about the author’s position in the network during text analysis. We investigate possible models ...
Added: October 31, 2021
BERT for Sequence-to-Sequence Multi-label Text Classification.
Yarullin R., Serdyukov P., in: Analysis of Images, Social Networks and Texts: 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020, Revised Selected Papers. Vol. 12602. Springer, 2021. P. 187–198.
Added: October 4, 2021
Named Entity Recognition from Chernobyl Documentaries
Daniil Tikhomirov, Nikitinsky N., Makarov I., in: Proceedings of the Conference on Modeling and Analysis of Complex Systems and Processes 2020 (MACSPro 2020). Vol. 2795. CEUR Workshop Proceedings, 2020. P. 133–139.
The paper describes a system that extracts facts and opinions from documentary texts to create a domain ontology of a controversial topic for the Chernobyl disaster. The pipeline of the system is based on an RNN-based NER module, which was tested on an annotated text corpus. ...
Added: January 4, 2021
Comparative Study Of Data Clustering Algorithms And Analysis Of The Keywords Extraction Efficiency: Learner Corpus Case
Scherbakova A., / NRU HSE. Series WP BRP "Linguistics". 2020.
Added: December 2, 2020
Keyphrase extraction from the Russian corpus on linguistics by means of KEA and RAKE algorithms
Moskvina Anna, Sokolova E., Mitrofanova O., in: Data Analytics and Management in Data Intensive Domains. Proceedings of the XX International Conference – DAMDID/RCDL’2018, October 9-12, 2018, Moscow. M.: FRC CSC RAS, 2018. P. 369–372.
This paper is devoted to a comparison of two state-of-the-art keyphrase extraction algorithms, namely KEA, based on machine learning, and RAKE, working with morphosyntactic patterns. The comparative study deals with peculiarities of KEA and RAKE with regard to particular research tasks. Experiments carried out on the Russian corpus on Linguistics allow us to work out the best options ...
Added: September 29, 2020