
 


Sociolinguistic Variability of Russian Everyday Speech: A Corpus-Based Study

P. 288–293.
Sherstinova T., Богданова-Бегларян Н. В., Баева Е., Горбунова Д., Попова Т. И., Blinova O. V.

The paper presents recent results of a multilevel analysis of representative corpus data, conducted to identify key speech parameters (lexical, morphological and syntactic) that can diagnose certain social or biological characteristics of a speaker or, more broadly, of a modern Russian urban sociolect. The study is based on the corpus of everyday Russian speech “One Speaker’s Day”. The data come from an annotated subcorpus of 289,205 tokens, which includes the recorded “speech days” of 57 men and 48 women who were the research participants, as well as speech fragments of 87 men and 139 women who were their interlocutors. The total number of speakers in the subsample thus amounts to 144 men and 187 women. The article also raises the question of whether data mining approaches are applicable to the subcorpus and considers possibilities for further research using machine learning. The results are important for optimizing speech technology systems, for the theoretical understanding of linguistic processes, and for monitoring social processes taking place in the modern Russian metropolis.

Language: English
Keywords: pragmatics, Russian language

In book

Proceedings of the 27th Conference of Open Innovations Association FRUCT
IEEE, 2020.
Similar publications
Juxtapositional vs. possessive-like encoding in Russian specificational constructions
Logvinova N., Russian linguistics 2026 Vol. 50 Article 11
This paper presents the first in-depth corpus-based study of a previously overlooked syntactic variation in Russian: the competition between juxtapositional (Nominative) and possessive-like (Genitive) encoding of the second noun (the term) in specificational constructions (e.g., ponjatie čest’ (notion.NOM honor.NOM) vs. ponjatie česti (notion.NOM honor.GEN) ‘the notion of honor’). While typological research has established cross-linguistic preferences for one encoding strategy over another, intralinguistic variation ...
Added: May 18, 2026
Discriminative Lemmatization of Abbreviations in the LLM Era
Глазкова А. В., Смаль И. В., Lyashevskaya O. et al., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2025 Vol. 527 P. 146–155
This paper presents a study on the effectiveness of discriminative methods for abbreviation lemmatization in Russian texts. Unlike generative approaches, discriminative models select the optimal lemma from a fixed set of candidates, eliminating the risk of generating grammatically incorrect word forms. For the first time in Russian language processing, we conduct a comprehensive analysis of ...
Added: March 10, 2026
Rubic2: Ensemble Model for Russian Lemmatization
Afanasev I., Glazkova A., Lyashevskaya O. et al., in: Proceedings of the 10th Workshop on Slavic Natural Language Processing (Slavic NLP 2025). Association for Computational Linguistics, 2025. P. 157–170.
Pre-trained language models have significantly advanced natural language processing (NLP), particularly in analyzing languages with complex morphological structures. This study addresses lemmatization for the Russian language, the errors in which can critically affect the performance of information retrieval, question answering, and other tasks. We present the results of experiments on generative lemmatization using pre-trained language ...
Added: March 10, 2026
Transformer-based approaches for lemmatizing abbreviations in Russian texts
Glazkova A., Lyashevskaya O., Morozov D. et al., Journal of Mathematical Sciences 2025 Vol. 546 P. 32–47
This paper addresses the task of lemmatizing abbreviations in the Russian language. Abbreviation lemmatization is particularly challenging, as it involves not only transforming a word into its normal form but also correctly expanding the abbreviation. We explore two approaches to this task, both leveraging large pretrained language models. The first approach is generative, where the ...
Added: March 10, 2026
T. G. Vinokur's Communicative Concept in the Context of Pragmatic Sociology (the Case of D. Danilov's Play “Seryozha Is Very Stupid”)
Nikishina E., in: The Speaker and the Writer: On the 100th Anniversary of the Birth of Tatyana Grigoryevna Vinokur. Moscow: V.V. Vinogradov Russian Language Institute of the Russian Academy of Sciences, 2024. P. 238–258.
The book is dedicated to the memory of a remarkable Russian language scholar, Tatyana Grigoryevna Vinokur (1924–1992). The range of issues addressed in the collected scholarly articles reflects the breadth of Tatyana Grigoryevna's research interests: the history of language, poetics, the language of fiction, stylistics, speech culture, problems of communication studies, and many other topics. ...
Added: March 8, 2026
The Legal Status of Compatriots Living in Post-Soviet Countries amid an Unstable International Situation
Затулин К. Ф., Егоров В. Г., Докучаева А. В. et al., Moscow: Institute of Diaspora and Integration (Institute of CIS Countries), 2025.
The book “The Legal Status of Compatriots Living in Post-Soviet Countries amid an Unstable International Situation” presents the results of a study conducted in Abkhazia, Azerbaijan, Armenia, Belarus, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, the Pridnestrovian Moldavian Republic, Tajikistan, Uzbekistan, Estonia, and South Ossetia. The study was carried out by the Institute of Diaspora and Integration (Institute of CIS Countries) in 2024. It included an analysis of regulatory legal ...
Added: February 3, 2026
Methods of Teaching Primary School Children to Read in Russian and English: Similarities and Differences
[s.n.], 2022.
The article highlights the importance of teaching children to read, along with its specific features and components. It considers the main methods used to teach reading in both Russian and English and compares the two languages. In addition, the article also compares the methods of teaching reading ...
Added: January 31, 2026
Explorations in Applied Ethnolinguistics: Words, Cultures, and Global Perspectives
Palgrave Macmillan, 2025.
This volume contributes to the growing body of cutting-edge research into the Natural Semantic Metalanguage (NSM) approach in linguistics. It explores the broad range of possible applications enabled by the NSM approach, from linguistic studies of semantics and culture to cross-cultural studies, psychology and childhood education. The volume builds on previous studies, bringing a diversity ...
Added: January 28, 2026
Semi-fake indexicals in Russian
Tiskin D., Типология морфосинтаксических параметров 2025 Vol. 8 No. 1 P. 112–129
There are several rival theories of fake indexicals, i.e. bound indexicals (prominently pronouns) whose φ-features do not semantically contribute to focus alternatives (e.g. Only Mary did her homework, John didn’t do his). According to Minimal Pronoun theories (such as Kratzer’s or Wurmbrand’s), bound pronouns are Merged without φ-features and acquire them under binding via agreement-like ...
Added: January 26, 2026
Some Modifications to I. Bassi's Theory of Bound Uses of Indexical Expressions
Tiskin D., Типология морфосинтаксических параметров 2024 Vol. 7 No. 1 P. 107–123
Fake indexicals (FIs), or bound-variable uses of e.g. 1st- and 2nd-person pronouns, have been analysed by Bassi (2021) as arising from a post-syntactic process of inspecting the features of the referent. This leads to a peculiar analysis of the syntax and semantics of relative clauses containing FIs. I argue for a more ...
Added: January 26, 2026
The Problem of Forming National Self-Awareness in Children through Native Language Study in the Works of K. D. Ushinsky
Бизяева Н. Д., Проблемы современного образования 2025 No. 4 P. 134–141
This study is the result of understanding the views of K. D. Ushinsky on the problem of forming national self-awareness in children in the process of studying their native language. It was determined that the idea of nationality, expressed in the theoretical and axiological principles of K. D. Ushinsky, was quite clearly expressed in “The ...
Added: December 16, 2025
Detecting Ethnic Conflict in Social Media with Transformers and Augmented Data
Koltsova O., Surkov A., Procedia Computer Science 2025 Vol. 258 P. 2382–2390
Detecting mentions or discussion of ethnic conflict, or verbal participation in it, in user-generated content is a socially important task, as such content has been shown to be related to ethnic clashes on the ground. Yet this task has not been ...
Added: November 28, 2025
Speech Acts with Polite Diminutives: Genre and Discourse Features
Fufaeva I., Вестник Волгоградского государственного университета. Серия 2: Языкознание 2025 Vol. 24 No. 4 P. 78–90
This study delves into speech acts utilizing diminutives for politeness, focusing on their discursive and genre-related aspects. It draws on authorial recordings of spoken discourse, data from the National Corpus of the Russian Language, and recordings of urban speech from the 1970s and late twentieth century. The research highlights the potential usage of polite diminutives in ...
Added: November 25, 2025
Interpretation of Complex Sentences with Different Types of Matrix Predicates in the Context of Negation and Modal Operators
Letuchiy A., Russian Linguistics 2025 Vol. 49 No. 2 Article 2
The article discusses the types of interpretation that Russian complex sentences with factive, implicative and interpretation verbs receive under negation and modal operators. By default, the external negative and modal context affects only the main situation. However, one finds exceptions to this rule. We call ‘transparent readings’ those readings in which the external context semantically affects both the matrix ...
Added: November 5, 2025
Gender stereotypes in agreement processing with role nouns: a study on Russian
Slioussar N., Antropova D., Frontiers in Psychology 2025 Vol. 16 Article 1619505
The majority of Russian nouns denoting professions and social roles are grammatically masculine. Some of them have feminine pairs, the others do not, but in modern Russian, most nouns in this group can be used to refer to women — either with masculine or with feminine agreement. This option has some interesting limitations that have ...
Added: September 22, 2025
New Designations for Men in Youth Slang
Krongauz M., Труды института русского языка им. В.В. Виноградова 2025 No. 3(45) P. 159–167
The article is devoted to modern youth slang, namely to the nominations of men that have appeared most recently: ank, masik, normis, sigma, skuf, tubik, chechik, shtrikh. It is noted that the words masik, tubik, chechik, shtrikh are often discussed together on the Internet and have common semantic and pragmatic characteristics. They denote types of ...
Added: September 17, 2025
The immediate and the naive metaphysics
Ivan B. Mikirtumov, Epistemology and Philosophy of Science 2025 Vol. 62 No. 3 P. 126–131
In this article, I discuss Pirmin Stekeler-Weithofer’s ideas about the nature of language and the metaphysical residue that seems to be present in the realm of immediate experience, despite all the criticism and success of positive knowledge. This includes, first and foremost, the ability to perceive objects, facts, and possible worlds which humans have from ...
Added: September 1, 2025
Collection of Research Papers on the Results of Research Work at the Institute of Foreign Languages of MPGU for the 2024/2025 Academic Year
Гаевская М. А., Moscow: MPGU, 2026.
The article focuses on approaches to the linguistic study of speech conflict. The author compares views on the nature of speech conflict in Russia and abroad. ...
Added: May 12, 2025
Cultural Evaluation of LLMs in Russian: Catchphrases and Cultural Types
Громенко Е. С., Калачева Д. С., Klokova K. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2025). [s.n.], 2025.
This study addresses the gap in evaluating large language models' (LLMs) cultural awareness and alignment within the Russian sociocultural context by introducing a structured framework comprising 8 Cultural Types (e.g., Spiritual Practitioner, Soviet Intellectual) and 5 catchphrase groups (e.g., memes, proverbs). A 400-question evaluation dataset was developed to probe 10 multilingual LLMs, including GPT-4o, ...
Added: May 10, 2025
Evaluating the Pragmatic Competence of Large Language Models in Detecting Mitigated and Unmitigated Types of Disagreement
Shulginov V., Hasan Berkcan Şimşek, Sergei Kudriashov et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2025), Issue 23. [s.n.], 2025. P. 345–360.
This study presents a framework for evaluating the effectiveness of large language models (LLMs) in detecting disagreement across a wide range of pragmatic strategies, from mitigated forms to overt verbal aggression. Special attention is given to complex cases of implicit manifestations of irony and sarcasm, which pose significant challenges for both automated analysis and interpersonal communication. ...
Added: April 30, 2025