
Written vs generated text: “naturalness” as a textual and psycholinguistic category

Научный результат. Серия: Вопросы теоретической и прикладной лингвистики. 2024. Vol. 10. No. 2. P. 71–99.
Kolmogorova A. V., Margolina A. V.

With the development of text generation technologies, the opposition “naturalness vs. unnaturalness of text” has been transformed into a new dichotomy: “naturalness vs. artificiality”. This article investigates the phenomenon of naturalness in this context from two perspectives: it compares the linguistic characteristics of natural text with those of generated (artificial) text, and it systematizes the introspective perceptions of native Russian-speaking informants as to what a “natural” text should be like and how it should differ from a generated one. The material for the study was a parallel corpus of film reviews in Russian consisting of two subcorpora: reviews written by people, and reviews generated by a large language model from prompts that are the beginnings of reviews in the first subcorpus. The two subcorpora were compared using computer-assisted text processing to calculate the values of 130 metrics of linguistic text complexity, a psycholinguistic experiment, expert text analysis, and contrastive analysis. The results show that, in terms of their linguistic characteristics, “natural” texts differ from generated texts mainly in the greater flexibility of their syntactic structure, which allows both omission or reduction of structures and redundancy, and in slightly greater lexical variability. Naturalness as a psycholinguistic category is related to the informants’ autostereotypical ideas about the cognitive characteristics of people as a species. The analysis of texts that informants attributed erroneously (generated texts labelled as natural and vice versa) showed that informants overestimate a number of characteristics of this autostereotype, while others generally correlate with the linguistic specificity of texts from the subcorpus of written reviews. In conclusion, we formulate definitions of naturalness as a textual and psycholinguistic category.
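
The comparison of two subcorpora on complexity metrics might be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the abstract does not list the 130 metrics, so the two used here (type-token ratio as a rough proxy for lexical variability, and mean sentence length in tokens) are assumptions, as are all function names and the toy input strings.

```python
import re


def tokenize(text):
    # Lowercased word tokens; in Python 3, \w also matches Cyrillic letters.
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text):
    # Share of distinct word forms among all tokens (assumed metric).
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text):
    # Average number of tokens per sentence, using a naive punctuation split.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def average(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def compare_subcorpora(written, generated):
    # For each metric, average it over both subcorpora and report the gap.
    metrics = {
        "type_token_ratio": type_token_ratio,
        "mean_sentence_length": mean_sentence_length,
    }
    report = {}
    for name, metric in metrics.items():
        w = average(metric(t) for t in written)
        g = average(metric(t) for t in generated)
        report[name] = {"written": w, "generated": g, "difference": w - g}
    return report


if __name__ == "__main__":
    # Invented toy texts, used only to show the call shape.
    written = ["Great film. Loved it, honestly!"]
    generated = ["The film is great. The film is very good."]
    for name, row in compare_subcorpora(written, generated).items():
        print(name, row)
```

Type-token ratio depends on text length, so a real comparison would need length-matched texts or a length-normalized measure. A parallel corpus built from shared review beginnings makes such matching easier to arrange.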

Research target: Philology and Linguistics
Language: English
Keywords: experiment; Russian language; controlled generation; metrics of text complexity; naturalness; psycholinguistic category; text category
Publication based on the results of:
Text as Big Data: methods and models for big text data analysis (2024)