

How to detect propaganda from social media? Exploitation of semantic and fine-tuned language models

PeerJ Computer Science. 2023. Vol. 9. Article e1248.
Malik M. S., Imran T., Mona Mamdouh J.

 

Online propaganda is a mechanism for influencing the opinions of social media users. It is a growing menace to public health, democratic institutions, and society at large. The present study proposes a propaganda detection framework, formulated as a binary classification model and built on a news repository. To develop a robust model, several feature models are explored: part-of-speech, LIWC, word uni-gram, Embeddings from Language Models (ELMo), FastText, word2vec, latent semantic analysis (LSA), and char tri-gram. BERT is also fine-tuned. Three oversampling methods are investigated to handle the class imbalance of the Qprop dataset; SMOTE with Edited Nearest Neighbors (ENN) gave the best results. Fine-tuning showed that BERT with a sequence length of 320 is the best configuration. As a standalone feature model, char tri-gram outperformed the other features. The combinations char tri-gram + BERT and char tri-gram + word2vec performed robustly and outperformed the two state-of-the-art baselines. In contrast to prior approaches, adding feature selection further improves performance, achieving more than 97.60% recall, F1-score, and AUC on the dev and test parts of the dataset. The findings of the present study can be used to organize news articles for various public news websites.
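To illustrate the char tri-gram feature model mentioned in the abstract, the sketch below extracts overlapping character tri-grams and feeds their counts to a small multinomial Naive Bayes classifier. This is only a minimal illustration under stated assumptions: the abstract does not name the classifier used in the study, so Naive Bayes is a stand-in, and the names `char_trigrams` and `TrigramNaiveBayes` as well as the example sentences are hypothetical. Oversampling, feature selection, and BERT fine-tuning are not shown.

```python
from collections import Counter
import math


def char_trigrams(text):
    """Return counts of overlapping character tri-grams (lowercased)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


class TrigramNaiveBayes:
    """Multinomial Naive Bayes over char tri-gram counts.

    Stand-in classifier for illustration; not the model from the study.
    """

    def fit(self, texts, labels):
        self.counts = {}             # label -> tri-gram counts for that class
        self.docs = Counter(labels)  # label -> number of training documents
        vocab = set()
        for text, label in zip(texts, labels):
            grams = char_trigrams(text)
            self.counts.setdefault(label, Counter()).update(grams)
            vocab.update(grams)
        self.vocab_size = len(vocab)
        self.totals = {c: sum(cnt.values()) for c, cnt in self.counts.items()}
        return self

    def predict(self, text):
        grams = char_trigrams(text)
        n_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, cnt in self.counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each tri-gram.
            score = math.log(self.docs[label] / n_docs)
            denom = self.totals[label] + self.vocab_size
            for gram, k in grams.items():
                score += k * math.log((cnt[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical toy data: 1 = propaganda, 0 = non-propaganda.
    texts = [
        "traitors and enemies are destroying our great nation",
        "the city council approved the annual budget on tuesday",
    ]
    model = TrigramNaiveBayes().fit(texts, [1, 0])
    print(model.predict("enemies of the nation"))
```

Character tri-grams need no tokenizer or language-specific preprocessing, which is one plausible reason they work well as a standalone feature model on noisy news text.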

Research target: Computer Science
Language: English
Keywords: propaganda, semantic analysis, binary model, linguistic, word2vec, news media, BERT
Publication based on the results of:
Models and method for analysis of unstructured data, data mining and recommender systems (2023)