HSE University (Higher School of Economics)
National Research University

 


Application of Computational Corpus Analysis Methods to the Study of Literary Texts

Труды Института системного анализа Российской академии наук. 2024. Vol. 74. No. 2. P. 25–32.
Аванесян Н. Л., Губина О. В., Chepovskiy A.

This article applies mathematical methods of corpus analysis to the study of Russian fiction. A corpus of 19th-century Russian prose fiction, consisting of five subcorpora, was created for the research; each subcorpus contains the texts of a single author. Using this corpus as an example, the article demonstrates the correspondence analysis method integrated into the TXM platform as one tool of statistical research. As a second method, it considers the analysis of pairwise rank correlation coefficients to compare the frequency characteristics of texts from different subcorpora. The two methods give correlated results and make it possible to identify differentiating features. The described approach can be used both in linguistic and literary studies and for creating appropriate training text sets for artificial intelligence tasks.
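
The pairwise rank correlation idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which rank coefficient the authors use or how they tokenize, so this example assumes Spearman's coefficient computed over word-frequency ranks, with tied frequencies sharing an average rank. The function names (`average_ranks`, `spearman`, `pairwise_spearman`) and the subcorpus labels are hypothetical and are not taken from the article or from TXM.

```python
from collections import Counter
from itertools import combinations


def average_ranks(values):
    """Rank values in descending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all ranks are equal")
    return cov / (var_x * var_y) ** 0.5


def spearman(tokens_a, tokens_b):
    """Spearman correlation of word-frequency ranks over the joint vocabulary."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(freq_a) | set(freq_b))
    ranks_a = average_ranks([freq_a[w] for w in vocab])
    ranks_b = average_ranks([freq_b[w] for w in vocab])
    return pearson(ranks_a, ranks_b)


def pairwise_spearman(subcorpora):
    """Map each pair of subcorpus labels to its Spearman coefficient."""
    return {
        (a, b): spearman(subcorpora[a], subcorpora[b])
        for a, b in combinations(sorted(subcorpora), 2)
    }


# Toy token lists standing in for three single-author subcorpora.
toy = {
    "author_1": "x x x y y z".split(),
    "author_2": "x x x y y z".split(),
    "author_3": "z z z y y x".split(),
}
result = pairwise_spearman(toy)
```

On real data, each token list would hold the lemmatized words of one author's subcorpus. Coefficients close to 1 would then indicate similar frequency profiles, and lower values would point to candidate differentiating features.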

Research target: Computer Science
Language: Russian
Keywords: corpus linguistics, correlation analysis, correspondence analysis, TXM platform
Similar publications
Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)
Seoul: PMLR, 2026.
Added: June 4, 2026
Syntactic functions of non-manuals in Russian Sign Language
Burkova S., Khristoforova E., Kimmelman V., in: Advances in Sign Language Corpus Linguistics. John Benjamins Publishing Company, 2023. P. 90–129.
This chapter presents the Russian Sign Language (RSL) Corpus and demonstrates its capabilities as a research tool by summarizing three corpus-based studies primarily focused on syntactic functions of nonmanual markers. The first study considers question marking in regular wh-questions and in question-answer pairs. It shows that the two constructions have very different nonmanual markers. The second study analyzes marking of ...
Added: June 3, 2026
OpenAtom Foundation: A Consortium Developing Open Source in China
Silakov D., Системный администратор 2026 No. 3 P. 28–33
In an article on platforms for open source software development in China, we covered GitCode, a young project positioned as a venue for developers from around the world. GitCode currently hosts projects created in the PRC, but some of them are already known internationally. The OpenAtom foundation is meant to help open projects get established, develop, and expand their audience ...
Added: June 2, 2026
The recognition-by-components method
Slivnitsin P., Mylnikov L., Engineering Applications of Artificial Intelligence 2026 Vol. 179 Article 115185
The paper describes an applied artificial intelligence task: recognizing real objects with the recognition-by-components method, which is based on recognizing a limited set of primitives or components. Recognition-by-components makes it possible to determine the components that compose an object and to increase the number of recognizable objects without degrading recognition quality. Training is performed on ...
Added: May 29, 2026
Brain-Computer Interfaces for Gait Rehabilitation After Stroke: A Scoping Review
Mokienko O., Zisman M. A., Bobrov P. et al., American Journal of Physical Medicine and Rehabilitation 2026 Vol. 105 No. 6 P. 555–563
Brain-computer interfaces (BCIs) represent a promising technology for restoring lower limb motor functions and gait after stroke. The application of BCIs in this field is supported by a limited number of studies. The objective of the review was to systematically and critically evaluate the current evidence on the use of BCIs for lower limb function ...
Added: May 28, 2026
Generalizing the Brady-Yong Algorithm: Efficient Fast Hough Transform for Arbitrary Image Sizes
Kazimirov D., Rybakova E., Vitalii V. Gulevskii et al., IEEE Access 2025 Vol. 13 P. 20101–20132
The Hough (discrete Radon) transform (HT/DRT) is a digital image processing tool that has become indispensable in many application areas, ranging from general image processing to neural networks and X-ray computed tomography. The utilization of the HT in applied problems demands its computational efficiency and increased accuracy. The de facto standard algorithm for the fast ...
Added: May 28, 2026
Universal Comparison Methodology for Hough Transform Approaches
Kazimirov D., Vitalii Gulevskii, Kroshnin A. et al., Mathematics 2026 Article 1136
The Hough transform (HT) is widely used in computer vision, tomography, and neural networks. Numerous algorithms for HT computation have been proposed, making their systematic comparison essential. However, existing comparative methodologies are either non-universal and limited to certain HT formulations, or task-oriented, relying on application-specific criteria that do not fully capture algorithmic properties. This paper ...
Added: May 28, 2026
Information Technologies and Technical Means of Control (ICCT-2024)
Moscow: V.A. Trapeznikov Institute of Control Sciences of the RAS, 2024.
The collection contains the proceedings of the VIII International Scientific Conference "Information Technologies and Technical Means of Control" (ICCT-2024). The conference addressed the prospects for developing scientific instrumentation in telecommunication and control systems, biomedical informatics, hardware and software for information and communication systems, reliability, diagnostics and non-destructive testing, control and automation systems, digital ecosystems, production and logistics management, methods of mathematical ...
Added: May 27, 2026
Non-linear in-band interference cancellation on base of conjugate gradients method
Degtyarev A., Bakhurin S., Yudin N., DSPA 2026 P. 1–6
This paper investigates one possible solution to the problem of self-interference cancellation (SIC) arising in the design of in-band full-duplex (IBFD) communication systems. Self-interference cancellation is performed in the digital domain using multilayer nonlinear models adapted via gradient-based optimization. The presence of local minima and saddle points during the adaptation of multilayer models limits the ...
Added: May 26, 2026
28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy – Including 14th Conference on Prestigious Applications of Intelligent Systems (PAIS 2025)
IOS Press, 2025.
Added: May 26, 2026
Comparative Study of Training Methods and Architectures of Echo State Networks
Androsov I., Proceedings of the Institute for System Programming of the RAS 2026 Vol. 38 No. 3 P. 87–114
This paper examines echo state networks (ESNs), one of the most prevalent approaches to implementing reservoir computing. An ESN consists of a recurrent neural network with fixed (untrained) weights and a readout layer that is typically linear and trainable. This approach enables the creation of energy-efficient and computationally efficient neural networks capable of real-time learning. However, since ...
Added: May 26, 2026
Source Code Refactoring Based on an LLM and a UML Extension
Караваева Е. А., Кулигин Л. А., Rezunik L. et al., Труды Института системного программирования РАН 2026 Vol. 38 No. 3 P. 67–94
The article presents a method for refactoring source code based on integrating a large language model (LLM) with an extended UML model of program code. The proposed approach identifies problematic code sections using anxiety functions and structural class metrics, and then performs automated refactoring. A key feature of the method is the use of an LLM to generate formal specifications in OCL (Object Constraint Language), ...
Added: May 24, 2026
Coping with AI errors with provable guarantees
Tyukin I., Tyukina T., van Helden D. P. et al., Information Sciences 2024 Vol. 678 Article 120856
AI errors pose a significant challenge, hindering real-world applications. This work introduces a novel approach to cope with AI errors using weakly supervised error correctors that guarantee a specific level of error reduction. Our correctors have low computational cost and can be used to decide whether to abstain from making an unsafe classification. We provide ...
Added: May 23, 2026
Overcoming the Curse of Dimensionality with Synolitic AI
Zaikin A., Sviridov I., Sosedka A. et al., Technologies 2026 Vol. 14 No. 2 Article 84
High-dimensional tabular data are common in biomedical and clinical research, yet conventional machine learning methods often struggle in such settings due to data scarcity, feature redundancy, and limited generalization. In this study, we systematically evaluate Synolitic Graph Neural Networks (SGNNs), a framework that transforms high-dimensional samples into sample-specific graphs by training ensembles of low-dimensional pairwise ...
Added: May 23, 2026
Stable On-the-Fly Learning for Dynamic Neural Networks With Delayed Inputs
Chertopolokhov V., Mukhamedov A., Bugriy G. et al., IEEE Access 2026 Vol. 14 P. 14369–14392
This study presents on-the-fly identification and multi-step prediction of nonlinear systems with delayed inputs using a dynamic neural network combined with a smooth projection onto ellipsoids. The projection enforces parameter constraints that guarantee stability, while a Lyapunov–Krasovskii analysis yields computable ultimate error bounds. Riccati-type matrix inequalities are derived, providing an efficient vectorization–projection–devectorization implementation suitable for ...
Added: May 22, 2026
An Attempt to Apply Social Network Analysis (SNA) to the Historical Narrative of a Polysubject Region (the Case of the Welsh Chronicle Brut y Tywysogyon)
Loshkareva M. E., Matveeva N., Вестник Томского государственного университета. История 2026 No. 100 P. 112–118
This research attempts to apply social network analysis (SNA) to the study of a medieval narrative source. The authors suppose that network analysis may offer new possibilities for studying the history of regions characterized by political fragmentation. The authors tried to construct networks of historical interactions from 1193 ...
Added: May 22, 2026
Reproducible Benchmark of Wavelet-Enhanced Intrabody Communication Biometric Identification
Jin S., Komarov M. M., Scientific Reports 2026
Intrabody communication (IBC) channels offer physiological diversity that can be leveraged for passive biometric identification in wearable devices. Recent reports of over 99 per cent identification accuracy have frequently resulted from data leakage, where samples from the same subject are seen in both training and evaluation, yielding inflated and unreliable metrics. In this work, we ...
Added: May 21, 2026
ML-based Fast Simulation of FARICH Responses
Shipilov F., Barnyakov A., Ivanov A. et al., / Series Physics "arxiv.org". 2026.
A fast simulation of the detector response is a vital task in high-energy physics (HEP). Traditional Monte-Carlo methods form the backbone of modern particle physics simulation software but are computationally expensive. We present a machine-learning-based approach to fast simulation of the Focusing Aerogel Ring Imaging Cherenkov (FARICH) detector response. Given a particle track and momentum, ...
Added: May 19, 2026
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Rabat: Association for Computational Linguistics, 2026.
Added: May 19, 2026
Focus on Vocabulary. Economics of Tangible and Intangible Assets: A Corpus-Based Dictionary and AI Exercises for English
Gorina O. G., Kucherenko S., Larisa K. et al., St. Petersburg: Астерион, 2026.
This textbook is an integrated teaching and learning resource for English for Specific Purposes (ESP) in the field of economics of tangible and intangible assets. Its design employs (i) modern corpus linguistics methods, including frequency analysis and keyword extraction based on authentic texts reflecting current trends in professional discourse, and (ii) artificial intelligence technologies for ...
Added: May 16, 2026
Russian Sociology amid the Digitalization of Society: Results of an Analysis of a Corpus of Scientific Texts
Smirnov A., Социологические исследования 2023 No. 4 P. 39–50
Using the analysis of a corpus of texts from eight leading Russian sociological journals, the article examines the impact of the digitalization of society on sociology in 2000–2021. Frequency analysis of 13.8 thousand scientific texts tracked the introduction of concepts related to digitalization into academic circulation. The article reveals the differences between the journals, due ...
Added: March 18, 2026
Promotional adjectives in grant proposal abstracts: a corpus study
Dmitriy S. Tulyakov, Tatiana M. Permyakova, Ekaterina A. Balezina, Вестник Волгоградского государственного университета. Серия 2: Языкознание 2025 Vol. 24 No. 6 P. 58–67
By effectively integrating promotional discourse into grant proposal abstracts, researchers can more compellingly present their ideas and increase their chances of securing funding. Implications of promotional adjectives in grant writing might differ across various research fields. This study aims to explore the use of promotional adjectives in abstracts of research grant proposals in six research ...
Added: March 2, 2026
"The Stars Recommend That Libras Drink Plum Wine": A Study of Astrological Discourse Based on Frequency Vocabulary Distributions and Sentiment Analysis
Kirina M., Лукьянчикова А. С., in: Language in the Era of Digital Transformations and the Development of Artificial Intelligence: Collection of Scientific Articles from the II International Scientific Conference, Minsk, October 23–24, 2025. Minsk: БГУИЯ, 2025. P. 74–85.
The article examines the characteristic features of horoscope texts as part of astrological discourse. The research material is a representative sample of daily predictions in Russian, published in open groups of the social network VKontakte and totaling 1,185,425 word tokens. Using methods of corpus and computational linguistics, it analyzes content lexical units, both those shared by all zodiac signs and those distinctive of each sign (in comparison ...
Added: February 28, 2026
Dynamics of the Perception of Squares in the City Space by Russian Speakers (A Comparative Analysis Based on Russian National Corpus Data)
Belova P., in: Current Issues in Linguistics and Literary Studies: Collection of Scientific Articles from the International Scientific Conference in Memory of Doctor of Philology, Professor L.A. Araeva (February 6–8, 2025). Кемеровский государственный университет, 2025. P. 155–160.
This article presents research results on how the perception of squares in the city space has changed over time in the Russian linguistic picture of the world, from the second half of the 20th century to the present. Turning to the subcorpus of literary texts of the second half of the 20th century and the 21st ...
Added: February 4, 2026