Extending phpMorphy Dictionary With Dialect Words

Journal of Physics: Conference Series. 2020. No. 1680.
Zemicheva S., Громов М. Л.

This paper describes the preparatory work for adding dialect words to the internal dictionary of phpMorphy. The work consisted of checking which dialect words are missing from the phpMorphy dictionary and developing a tool to facilitate the process of adding new words. phpMorphy is intended to serve as the main element of the text corpus search system.
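The checking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' tool and does not call phpMorphy: the dictionary is stood in for by a plain list of lemmas, and the function name `find_missing_words` and all sample words are hypothetical. The sketch only shows the idea of comparing a dialect word list against a set of known lemmas.

```python
def find_missing_words(dialect_words, known_lemmas):
    """Return dialect words absent from the known lemmas.

    Words are compared case-insensitively; the result keeps first-seen
    order and contains no duplicates.
    """
    known = {lemma.casefold() for lemma in known_lemmas}
    missing = []
    seen = set()
    for word in dialect_words:
        key = word.strip().casefold()
        # Skip empty entries, words already known, and repeats.
        if key and key not in known and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Hypothetical stand-in for the dictionary's lemma list (assumption).
known_lemmas = ["дом", "лес", "вода"]
# Hypothetical dialect word list (assumption).
dialect_words = ["лес", "Баской", "лыва", "баской", "вода"]

missing = find_missing_words(dialect_words, known_lemmas)
print(missing)
```

In a real setting the lemma set would come from the dictionary itself, and the comparison would probably need morphological normalization rather than a plain string match.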

Language: English
Keywords: corpus linguistics, Russian Siberian dialects, electronic dictionary, phpMorphy, morphological annotation
Similar publications
Russian sociology in the context of the digitalization of society: results of an analysis of a corpus of scientific texts
Smirnov A., Социологические исследования 2023 No. 4 P. 39–50
Using the analysis of a corpus of texts from eight leading Russian sociological journals, the article examines the impact of the digitalization of society on sociology in 2000–2021. Frequency analysis of 13.8 thousand scientific texts tracked the introduction of concepts related to digitalization into academic circulation. The article reveals the differences between the journals, due ...
Added: March 18, 2026
Promotional adjectives in grant proposal abstracts: a corpus study
Dmitriy S. Tulyakov, Tatiana M. Permyakova, Ekaterina A. Balezina, Вестник Волгоградского государственного университета. Серия 2: Языкознание 2025 Vol. 24 No. 6 P. 58–67
By effectively integrating promotional discourse into grant proposal abstracts, researchers can more compellingly present their ideas and increase their chances of securing funding. Implications of promotional adjectives in grant writing might differ across various research fields. This study aims to explore the use of promotional adjectives in abstracts of research grant proposals in six research ...
Added: March 2, 2026
Dynamics of the perception of squares in city space by Russian speakers (a comparative analysis based on Russian National Corpus data)
Belova P., in: Актуальные вопросы лингвистики и литературоведения: сборник научных статей по материалам международной научной конференции памяти доктора филологических наук, профессора Л.А. Араевой (6–8 февраля 2025). Кемеровский государственный университет, 2025. P. 155–160.
This article presents research on how the perception of squares in city space has changed over time in the Russian linguistic picture of the world, from the second half of the 20th century to the present. Turning to the subcorpus of literary texts of the second half of the 20th century and the 21st ...
Added: February 4, 2026
Preposition drop in Russian spoken by Mari and Beserman bilinguals
Yakovleva A., Kosheliuk N., Moroz G., International Journal of Bilingualism 2025 P. 1–19
Aims and Research Questions: In this paper, we present a corpus-based study of preposition drop (p-drop) in the speech of Mari-Russian and Beserman-Russian bilinguals compared to the speech of Russian monolinguals. Based on data from spoken corpora, we demonstrate that the prepositions v ‘in’, k ‘to’, s ‘with’ are omitted in the speech of bilinguals ...
Added: November 26, 2025
Variation between годов and лет in Russian dialects: a corpus study
Zemicheva S., Moroz G., Naccarato C., Вопросы языкознания 2025 No. 6 P. 7–34
The suppletive form лет in the paradigm of the noun год distinguishes Russian from the other East Slavic languages. In Russian dialects, however, the variant годов may be used instead of лет. Data from the panchronic subcorpus of the Russian National Corpus show that the form годов, first attested in the 15th century, remained peripheral throughout the history of Russian, was used mainly in non-fiction texts in the 17th–18th centuries, and in ...
Added: November 12, 2025
Automatic Annotation of Discourse and Speech Formulas in Internet Communication: A Telegram Comment Corpus
Maslenikova A., Tatiana I. Popova, in: 27th International Conference, SPECOM 2025, Szeged, Hungary, October 13–15, 2025, Proceedings, Part I. Speech and Computer. Lecture Notes in Artificial Intelligence 16187. Vol. 16187: Lecture Notes in Artificial Intelligence. Springer, 2025. P. 278–292.
This article presents a system for the automatic processing of user comments aimed at annotating speech and discourse formulas that actively function in everyday interaction, including digital communication. A Python-based program using the Telegram API was developed to automate the collection, filtering, and annotation of empirical data. In addition to building a user corpus, the ...
Added: October 19, 2025
27th International Conference, SPECOM 2025, Szeged, Hungary, October 13–15, 2025, Proceedings, Part II. Speech and Computer. Lecture Notes in Artificial Intelligence 16188
Springer, 2025.
Added: October 19, 2025
Variation in a Narrative Corpus of Mano and Kpelle: Contact-Induced or Not?
Khachaturyan M., Konoshenko M., Moroz G. et al., in: N’yng-dyuumgu, n’yng-ngafq: Festschrift for Ekaterina Gruzdeva. Vol. 126. Helsinki: Studia Orientalia, 2025. P. 35–59.
This paper explores a corpus of spontaneous narratives and narrative retellings told by children and adults in Mano and Kpelle, two contacting Mande languages. It focuses on quotative constructions as a key point of grammatical dissimilarity between Mano and Kpelle. In the Mano speech of some bilingual children, however, these constructions are found to manifest ...
Added: September 5, 2025
Correspondence between N. S. Khrushchev and F. Castro during the Caribbean crisis: an experiment in computerized analysis
Герцен А. С., in: Четвёртая зимняя школа по гуманитарной информатике. Балтийский федеральный университет им. Иммануила Канта, 2020. P. 92–97.
The article analyzes letters on the Caribbean crisis written between October 26 and 31, 1962 by N. S. Khrushchev, First Secretary of the Central Committee of the CPSU and Chairman of the Council of Ministers of the USSR, and F. Castro Ruz, leader of the Cuban revolution, and ...
Added: July 15, 2025
An overview of morphosyntactic variation in the speech of Russian-Chuvash bilinguals: number, gender, case assignment and preposition drop
Grishanova A., Russian linguistics 2025 Vol. 49 Article 10
The purpose of this study is to present a summary of morphosyntactic variation and a detailed analysis of the phenomenon of preposition drop in the Russian speech of Chuvash bilinguals. Specifically, I investigate what underlying factors might condition the variation. I conduct a qualitative analysis of the data extracted from the corpus of Russian spoken ...
Added: July 10, 2025
Do Formal Stance Strategies Reveal Disciplinary Variation in Professional Scientific Writing?
Smirnova E. A., Pérez-Guerra J., International Journal of Applied Linguistics 2025 Vol. 35 No. 3 P. 1242–1261
Stance in academic discourse has been extensively studied, with numerous investigations indicating that its expression varies across disciplines, depending on the authors’ intention to either enhance or diminish their voice or presence (e.g. It seems fairly certain versus This is based on the belief that...). This paper hypothesises that stance can be viewed as a ...
Added: April 10, 2025
The Russian language in contact settings: Turkic-Russian language interaction. Part 1. A sociolinguistic and corpus study
Резанова З. И., Artemenko E., Диброва В. С. et al., Томск: Издательство Томского государственного университета, 2024.
The monograph presents the linguistic, sociolinguistic and psycholinguistic aspects of the interaction between Russian and three Turkic languages: Shor, Khakas and Tatar (the Siberian variety). It characterizes the ways in which the Turkic languages influence the speech practice of Russian-speaking bilinguals and their cognitive processes of speech production and perception. It presents methods for collecting data and for processing them when building a sociolinguistic database and a morphologically annotated bimodal corpus of the spoken Russian of bilinguals, ...
Added: April 7, 2025
The ‘adverb-ly adjective’ construction in English: meanings, distribution and discourse functions
Taboada M., Goddard C., Trnavac R., English Language and Linguistics 2025 Vol. 29 No. 1 P. 102–131
We investigate a class of adjective phrases composed of a deadjectival adverb ending in -ly and an adjective head (e.g. staggeringly incompetent, absolutely terrific, fiscally responsible), a compact construction whereby two adjectives may jointly contribute to evaluative meaning. Using corpus methodologies on more than 1 million examples and relying on semantic analyses of about 1,000 instances, we propose that the ...
Added: April 4, 2025
Standard Dargwa Corpus
Toldova S., Sokur E., in: Современная лингвистика: от теории к практике: III Казанский международный лингвистический саммит: (Казань, 14–19 ноября 2022 г.): тр. и матер.: в 3 т. Т. 1. Каз.: Издательство Казанского университета, 2023.
This paper describes an ongoing project of creating the Standard Dargwa Corpus. Dargwa < East Caucasian is one of the written languages of the Republic of Daghestan (Russia). Standard Dargwa is the standardized language used in writing: in the Soviet period it was created based on the dialect of Aqusha. It has official status and is ...
Added: March 12, 2025
Creation and Analysis of the Multimedia Russian Corpus for Gesture Research
Rakhilina E. V., Cienki A., in: The Cambridge Handbook of Gesture Studies. Cambridge University Press, 2024. P. 249–272.
The chapter considers gesture studies in relation to corpus linguistic work. The focus is on the Multimedia Russian Corpus (MURCO), part of the Russian National Corpus. The chapter includes a brief biography of the creator of this corpus, Elena Grishina. The compilation of the corpus out of a set of Russian classic feature films and ...
Added: February 13, 2025
Non-standard numeral constructions in L2 Russian: A corpus-based study
Naccarato C., Moroz G., International Journal of Bilingualism 2026 Vol. 30 No. 2 P. 358–379
Aims and Research Questions: The paper investigates variation in numeral constructions in the L2 Russian speech of bilinguals from different regions of Russia. The main research questions are the following: What factors prompt variation in this domain of grammar? Can we argue that non-standard marking is motivated by contact? Methodology: We conduct a corpus-based study ...
Added: January 24, 2025
Using computational linguistics methods for the analysis of literary texts
Аванесян Н. Л., Fokina A., Chepovskiy A., in: Инжиниринг предприятий и управление знаниями (ИП&УЗ-2024): сборник научных трудов XXVII Российской научной конференции. 28–29 ноября 2024 г. / под науч. ред. Ю. Ф. Тельнова. М.: ФГБОУ ВО "РЭУ им. Г.В. Плеханова", 2024. P. 15–18.
The article is devoted to the application of mathematical methods of corpus analysis to the study of literary texts. Using the created corpora as examples, it demonstrates how correspondence analysis and the analysis of pairwise rank correlation coefficients can be applied to compare the frequency characteristics of texts from different subcorpora. The described techniques give correlated results. They can be used both for linguistic research and for creating sound training text sets for artificial intelligence tasks. ...
Added: December 19, 2024
Corpus linguistics at the present stage
Plungian V., Вестник Российской академии наук 2024 Vol. 94 No. 9 P. 787–794
The paper gives a general overview of corpus linguistics, its history, its methods, and its influence on contemporary views of language study, an influence usually referred to as the “corpus revolution”. ...
Added: December 16, 2024
The populist text as an object of corpus research
Галочкин А. Е., in: ЧЕЛОВЕК В СИСТЕМЕ КОММУНИКАЦИЙ: ПРОФЕССИОНАЛЬНЫЕ КОММУНИКАЦИИ В ЦИФРОВУЮ ЭПОХУ. Нижегородский государственный лингвистический университет им. Н.А. Добролюбова, 2023. P. 87–90.
This article discusses the phenomenon of populism in the context of corpus linguistics methods, which is of particular importance in the modern world. The relevance of this study is related to the growth of right-wing populism in European countries and the importance of understanding the mechanisms of populist discourse. The article analyzes studies aimed at ...
Added: November 16, 2024
Коньячку бы, да до дому: a chronology of the development of some forms of the second genitive case
Budennaya E., Труды института русского языка им. В.В. Виноградова 2024 No. 2(40) P. 261–282
Based on material from the Russian National Corpus, the article discusses the diachronic development of structures with the Russian second genitive case in three types of contexts: 1) with nominal quantifiers; 2) with the preposition bez ‘without’; 3) with the preposition do ‘towards’. The data obtained from Russian are compared with data from other languages (Finnic and several Turkic), in which there is a tendency to use the partitive ...
Added: October 4, 2024