High-Frequency Multiword Units and the Typological Distribution of Multiword Units in Spoken Russian

P. 257–270.
Bogdanova-Beglarian N. V., Blinova O. V., Khokhlova M., Sherstinova T., Popova T. I.

Multiword units (MWUs) constitute a distinct class of linguistic phenomena located at the crossroads of lexis and syntax. Empirical data on their typology and frequency are essential for solving a wide range of applied problems in natural language processing. This paper presents a corpus-based study of MWUs in Russian everyday speech. In the ORD corpus, which comprises one million words of transcribed spontaneous discourse, over 8,000 MWU instances were identified and annotated. These MWUs fall into eight main classes: non-phraseologized collocations, phraseologized collocations, occasional collocations, idiom forms, constructions, precedent texts and their elements, multiword pragmatic markers, and speech formulas. The paper presents a ranked list of the 50 most frequent MWUs in spoken Russian, along with the overall distribution of MWU types. The results indicate that pragmatic markers are the dominant category (over 30% of all MWUs), followed by non-phraseologized collocations (26%) and speech formulas (21%). The article also discusses the functional combinations of MWUs in spoken interaction and highlights precedent texts as one of the productive sources for MWU formation. The quantitative data obtained in this study contribute both to theoretical models of the lexical and grammatical description of Russian everyday speech and to practical tasks related to processing and generating spontaneous spoken language.

Language: English
Keywords: statistical analysis, corpus linguistics, oral discourse, multiword units

In book

27th International Conference, SPECOM 2025, Szeged, Hungary, October 13–15, 2025, Proceedings, Part II. Speech and Computer. Lecture Notes in Artificial Intelligence 16188
Vol. 16188. Springer, 2025.
Similar publications
On the Possibility of Applying Convolutional Neural Networks to the Construction of Universal Attacks on Iterative Block Ciphers
Perov A., Пестунов А. И., Прикладная дискретная математика 2020 No. 3 P. 46–56
The paper explores the possibility of applying convolutional neural networks to the security analysis of iterative block ciphers. A new approach for constructing distinguishing attacks based on a convolutional neural network is proposed. The approach is based on distinguishing between graphic equivalents of ciphertexts obtained in the CTR (counter) encryption mode after different numbers of rounds, including the number of ...
Added: November 1, 2021
Communication Failures in Everyday Conversations: a Case Study Based on the “Retrospective Commenting Method”
Mustajoki A., Cherkunova N., Sherstinova T., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2021). Issue 20: Main volume. 2021. P. 514–523.
The paper deals with communication failures in everyday spoken discourse. The spontaneous character of oral speech is its basic property and becomes a prerequisite for the appearance of such a phenomenon as communicative failures. By communicative failures, we mean speech situations when the recipient of a speech message does not understand it correctly, i.e., in ...
Added: August 31, 2021
In Search of Basic Units of Spoken Language: A Corpus-Driven Approach
John Benjamins Publishing Company, 2020.
What is the best way to analyze spontaneous spoken language? In their search for the basic units of spoken language the authors of this volume opt for a corpus-driven approach. They share a strong conviction that prosodic structure is essential for the study of spoken discourse and each bring their own theoretical and practical experience ...
Added: October 10, 2019
A Complex of Glyadenovo-Period Structures at the Mokino I Settlement and Burial Ground in the Context of the Development of House Building in the Kama Region
Васильева А. В., Mingalev V. V., Перескоков М. Л., Вестник Пермского университета. Серия: История 2018 No. 1(40) P. 44–61
The article analyzes the complexes of objects discovered during the 2015–2017 excavations in Mokino that can be interpreted as the remains of the settlement of the Glyadenovo culture preceding the construction of the cemetery. Two construction horizons were revealed, reflecting the stages of the settlement’s functioning. Building No. 2 belongs to the early horizon. ...
Added: April 17, 2018
Statistical Analysis and Modeling of Wastewater Quality Variability in an Industrial Wastewater Disposal System
Kopnova E., Rodionova L., Вопросы статистики 2016 No. 4 P. 41–50
This article presents the results of a study on the economic and statistical justification for improving the water and environmental management of an industrial enterprise. As the main tool, the authors applied a method for modeling time series using stationary stochastic processes. Seasonally adjusted autoregressive integrated moving average models were used ...
Added: May 25, 2016
Extraction of Cause – Effect Relationships from Psychological Test Data Using Logical Methods
Panov A. I., Scientific and Technical Information Processing 2014 Vol. 41 No. 5 P. 275–282
The combined use of AQ learning and the JSM method to extract cause–effect relationships from psychological test data is considered. AQ learning is used to describe a test group using rules. The group description is the basis for constructing the factbase of the JSM method. The first stage of the JSM method is used ...
Added: November 20, 2015
Causal Analysis, History of
Liao T., Deviatko I. F., in: International Encyclopedia of the Social & Behavioral Sciences (Second Edition). Elsevier, 2015. P. 247–250.
This article examines the history of causal research traditions in the social sciences. We identify two major bases for the methods and logic of causal analysis in the social sciences – experimental designs and statistical methods – and discuss the developments in these two correlated research traditions, especially the implications of these developments for the ...
Added: August 20, 2015
International Scientific Conference “Globalization and Statistics”. Proceedings
Tbilisi: Tbilisi State University, 2014.
Proceedings of the International Scientific Conference "Statistics and Globalization", dedicated to the 70th anniversary of the founding of the Department of Statistics, Tbilisi State University. The conference was included in the calendar of conferences of the International Statistical Association. The conference reports reflected problems of the theory and methodology of modern statistics, ...
Added: November 18, 2014
The Level of Economic Development as a Factor in the Dynamics of the Level of Democracy in Post-Communist States
Kamalova R., in: Proceedings of the Seminar “Mathematical Modeling of Political Systems and Processes”. Issue 2. Moscow: Издательство Московского университета, 2013. P. 69–90.
The article presents contemporary approaches to measuring democracy levels and the impact of economic development on democracy. ...
Added: December 23, 2013
Dominant Party Rule and Legislative Leadership in Authoritarian Regimes
Reuter O. J., Turovsky R. F., Party Politics 2014 No. 20 P. 663–674
Authoritarian dominant parties are said to ensure elite loyalty by providing elites with regularized opportunities for career advancement. This article uses data on the distribution of leadership posts in Russia’s regional legislatures (1999–2010) to conduct the first systematic test of this proposition. Loyalty to the nascent hegemonic party, United Russia, is shown to be important ...
Added: November 18, 2013
Russians' Holidays Abroad: What Do the Statistics Say?
Rodionova L., Вопросы статистики 2013 No. 11 P. 23–30
The paper studies the tourist trips of Russians abroad and identifies the major trends. The main indicators of international tourism in the world and in Russia are analyzed. The impact of individuals' socio-demographic characteristics on the likelihood of a tourist trip abroad is modeled using binary choice models. ...
Added: November 15, 2013
A Graphical Method for Analyzing the Dynamics of the Structure of Budget Expenditures
Korneychuk B. V., in: Financial Problems and Ways to Solve Them: Theory and Practice. Collected papers of the 14th International Scientific and Practical Conference, April 23–25, 2013. St. Petersburg: Издательство Политехнического университета, 2013. P. 18–26.
A method is proposed for the graphical representation and analysis of the structure of budget expenditures, based on calculating three main characteristics of structural dynamics in a multidimensional space: shift, deviation, and distancing. The method is applied to the analysis of the dynamics of budget expenditures in the USA in 2000–2011. ...
Added: April 25, 2013
Informatization of Corporate Planning and Budgeting
Isaev D., Бизнес-информатика 2013 No. 1 (23) P. 58–63
The paper considers integrated information systems for corporate planning and budgeting. Four groups of practical tasks that exceed the typical functionality of special-purpose planning and budgeting information systems are identified. Several classes of information systems (simulation, statistical analysis, financial analysis and modeling, group decision making, business intelligence), which may provide the completeness ...
Added: April 8, 2013
Marketing management: selected issues
Bielsko-Biala: ATH University, 2012.
Certain trends led to the economic crisis that began in 2008 and still continues, although with less intensity. Its negative consequences for individual companies, industries, and the economy will be felt for a long time, in different ways and to different extents. This situation requires exploration and continuous improvement of mechanisms, processes, ...
Added: March 26, 2013
The structure and dynamics of the workers' protest movement at the beginning of the 20th century in Russia. Database analysis
Borodkin L. I., Shilnikova I., Irina M. P., / Series IISH "Research Paper". 2012. No. ISSN 0927-4618.
The article gives the first statistical analysis in the Russian and international research literature of the structure and dynamics of the Russian workers’ protest movement in 1895–1904. This pre-revolutionary decade had an impact on subsequent events in the history of the country. The analysis is based on a database built by the authors that includes information ...
Added: March 22, 2013
On the System of Statistical Indicators of the Activity of Courts of General Jurisdiction of the Russian Federation and of Convictions
Savyuk L. K., in: Criminal Law and Modernity. Collection of articles. Issue 3. Moscow: Юрист, 2011. P. 90–106.
The concept of a statistical indicator is defined; the system of such indicators and their role and significance for evaluating the activity of courts of general jurisdiction at all levels are substantiated; the specifics of compiling statistical reports based on the structure of attributes in the defendant's statistical record card, of calculating the conviction rate, and of using it in the analytical work of courts of general jurisdiction are described. ...
Added: March 4, 2013
An SEM Approach to Continuous Time Modeling of Panel Data: Relating Authoritarianism and Anomia
Voelkle M. C., Oud J. H., Davidov E. et al., Psychological methods 2012 Vol. 17 No. 2 P. 176–192
Panel studies, in which the same subjects are repeatedly observed at multiple time points, are among the most popular longitudinal designs in psychology. Meanwhile, there exists a wide range of different methods to analyze such data, with autoregressive and cross-lagged models being 2 of the most well known representatives. Unfortunately, in these models time is ...
Added: February 4, 2013
Statistical Analysis of Priority Sectors for the Development of Small and Medium-Sized Enterprises
Alimova T. A., Popovskaya E. V., Вопросы статистики 2003 No. 11 P. 35–47
Small and medium-sized enterprises are the main clients of service organizations that form part of the business support infrastructure. The question of the sector-specific nature of business development is particularly important. It is in the sectors where business is developing at an accelerated pace that an increase in demand for business services should be expected. Awareness of the priority directions of development of the small and medium-sized enterprise sector will allow service organizations to put together a package of services, ...
Added: December 14, 2012