• A
  • A
  • A
  • АБВ
  • АБВ
  • АБВ
  • A
  • A
  • A
  • A
  • A
Обычная версия сайта
  • RU
  • EN
  • HSE University
  • Publications
  • Book chapter
  • Аннотирование учебного корпуса в аспекте его использования для исследовательских задач
  • RU
  • EN
Расширенный поиск
Высшая школа экономики
Национальный исследовательский университет
Priority areas
  • business informatics
  • economics
  • engineering science
  • humanitarian
  • IT and mathematics
  • law
  • management
  • mathematics
  • sociology
  • state and public administration
by year
  • 2027
  • 2026
  • 2025
  • 2024
  • 2023
  • 2022
  • 2021
  • 2020
  • 2019
  • 2018
  • 2017
  • 2016
  • 2015
  • 2014
  • 2013
  • 2012
  • 2011
  • 2010
  • 2009
  • 2008
  • 2007
  • 2006
  • 2005
  • 2004
  • 2003
  • 2002
  • 2001
  • 2000
  • 1999
  • 1998
  • 1997
  • 1996
  • 1995
  • 1994
  • 1993
  • 1992
  • 1991
  • 1990
  • 1989
  • 1988
  • 1987
  • 1986
  • 1985
  • 1984
  • 1983
  • 1982
  • 1981
  • 1980
  • 1979
  • 1978
  • 1977
  • 1976
  • 1975
  • 1974
  • 1973
  • 1972
  • 1971
  • 1970
  • 1969
  • 1968
  • 1967
  • 1966
  • 1965
  • 1964
  • 1963
  • 1958
  • More
Subject
News
June 5, 2026
Neural Network Maps as a Method for Constructing Mathematical Models
Scientists from HSE University–Nizhny Novgorod and the Institute of Physics Belgrade, Serbia, are jointly exploring the application of machine learning techniques and neural networks to the study of nonlinear dynamics. Natalya Stankevich, Leading Research Fellow at the Laboratory of Topological Methods in Dynamics of the Faculty of Informatics, Mathematics, and Computer Science at HSE University–Nizhny Novgorod, spoke to the HSE News Service about this international project.
June 5, 2026
‘In the Age of Technology, It Is Interesting to Look into the Past and Think about What We Can Take from It
Polina Tabakova decided to apply for a Philology degree at HSE in Nizhny Novgorod because she grew up in Mari El and did not want to move far away from the Russian forests. In an interview for the Young Scientists of HSE University project, she spoke about the genre of the campus novel, the existential drama of Kolobok, and a blackout version of Eugene Onegin.
June 5, 2026
HSE Scientists Develop Method to Compress Large Language Models Without Losing Quality
Researchers from the AI and Digital Science Institute at the HSE Faculty of Computer Science have developed a new compression method for large language models such as GPT and LLaMA that reduces their size by 25–36% without additional training or significant loss of accuracy. This is the first approach to use mathematical transformations—specifically, rotations of model weights—to make models more amenable to compression with structured matrices. The study results have been published in ACL Findings 2025. The code is available on GitHub.

 

Have you spotted a typo?
Highlight it, click Ctrl+Enter and send us a message. Thank you for your help!

Publications
  • Books
  • Articles
  • Chapters of books
  • Working papers
  • Report a publication
  • Research at HSE

?

Аннотирование учебного корпуса в аспекте его использования для исследовательских задач

С. 46–50.
Klimova M., Viklova A., Overnikova D.
Language: Russian
Full text
Keywords: экспериментальные исследованияучебный корпусlearner corpuserror annotationExperimental Studiescomputer-aided error analysisаннотирование ошибоккомпьютерный анализ ошибок
Publication based on the results of:
ONCORR - online correction for English texts (2022)

In book

Современная лингвистика: от теории к практике. III Казанский международный лингвистический саммит (Казань, 14–19 ноября 2022 г.): Труды и материалы, в трёх томах, том 1
Каз.: Издательство Казанского университета, 2022.
Similar publications
Гендерные различия в игре диктатора: сравнение поведения больших языковых моделей и людей
Parshakov P., Paklina S., Matkin N. et al., Вестник Пермского университета. Серия: Экономика 2026 Т. 21 № 1 С. 42–57
Introduction. Large language Models (LLM) are increasingly being used in social sciences to simulate the behavior of experimental participants and analyze norms of cooperation and justice. However, the question remains whether they are capable of reproducing social asymmetries, including gender differences. Goal. The work aims to test whether LLM reproduces gender differences in the Dictator ...
Added: October 27, 2025
Syntactic complexity measures as linguistic correlates of proficiency level in learner Russian
Kisselev O., Klimov A., Mihail Kopotev, , in: Complexity, Accuracy and Fluency in Learner Corpus Research. Volume vi.: Amsterdam: John Benjamins Publishing Company, 2022. Ch. 3 P. 51–80.
The study reports on the results of a corpus-based evaluation of automatically extracted syntactic complexity measures as indices of Russian as a foreign language (FL) and Russian as a heritage language (HL) writing development. A list of 12 syntactic complexity measures was tested on a set of longitudinal, classroom-based data. The analyses demonstrated that the ...
Added: November 25, 2024
Distractor Generation for Lexical Questions Using Learner Corpus Data
Nikita Login, Jazykovedny Casopis 2023 Vol. 74 No. 1 P. 345–356
Learner corpora with error annotation can serve as a source of data for automated question generation (QG) for language testing. In case of multiple choice gapfill lexical questions, this process involves two steps. The first step is to extract sentences with lexical corrections from the learner corpus. The second step, which is the focus of ...
Added: September 16, 2024
Обработка слов с частотными орфографическими ошибками (исследование на базе учебного корпуса английского языка)
Klimova M., Viklova A., Overnikova D., Вестник Санкт-Петербургского университета. Язык и литература 2023 Т. 20 № 4 С. 824–837
The article presents an experimental study of the influence of the frequency of spelling errors in a word on its representation in mental lexicon. The hypothesis that frequently misspelled words cause difficulties in reading even if they are written correctly has been proved for native speakers of Russian and English. This paper aims to check ...
Added: January 26, 2024
Устный учебный корпус РКИ: новый источник данных для лингвистических и методических исследований
Vlasova E., Бец Ю. В., Северина Е. М., В кн.: «Русская грамматика в диалоге научных школ, направлений, методов».: Владивосток: Издательство ДВФУ, 2022.
В статье анализируются нетривиальные фонетические и грамматические явления устной речи иностранцев, изучающих русский язык. Показано, что устный учебный корпус позволяет получить систематическое представление о компенсаторных механизмах речепорождения, проверять и формулировать гипотезы. ...
Added: November 8, 2023
Clausal complexity of expert and student writing: a corpus-based analysis of papers in social sciences
Smirnova E. A., Language Learning in Higher Education 2022 Vol. 12 No. 2 P. 453–475
Syntactic complexity has been extensively approached in the fields of corpus linguistics and academic discourse studies. However, works focusing on disciplinary variation in terms of linguistic complexity and comparison of professional and novice academic writing are scarce. Addressing these issues is likely to have important implications for EAP/ESP practitioners in terms of selection of target ...
Added: December 7, 2022
Review of Practices of Collecting and Annotating Texts in the Learner Corpus REALEC
Vinogradova O. I., Lyashevskaya O., , in: Text, Speech, and Dialogue. 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings Lecture Notes in Computer Science (LNAI), vol. 13502Vol. 13502.: Cham: Springer Publishing Company, 2022. P. 77–88.
REALEC, learner corpus released in the open access, had received 6,054 essays written in English by HSE undergraduate students in their English university-level examination by the year 2020. This paper reports on the data collection and manual annotation approaches for the texts of 2014–2019 and discusses the computer tools available for working with the corpus. ...
Added: October 5, 2022
Experimental Analysis of the Electrodynamic Characteristics of the Coaxial-Radial Line Type Slow-Wave Structures and Its Modifications
Presnyakov S., Kasatkin A., Kravchenko N. et al., , in: 2022 Systems of Signal Synchronization, Generating and Processing in Telecommunications (SYNCHROINFO).: IEEE, 2022. P. 1–4.
In this paper, we experimentally analyze the electrodynamic characteristics of a slow-wave structure (SWS) of the coaxial-radial line (CRL) type. Such SWSs are widely used in high-power traveling-wave tubes (TWTs) provide high output powers due to good heat removal, but have the limited amplified frequency band. For analysis, the resonance method and the electron probe ...
Added: September 15, 2022
Влияние родного языка (L1) на когнитивную обработку грамматической категории рода существительных русского языка (L2) русско-тюркскими билингвами
Некрасова Е.Д., Резанова З. И., Палий В. Е., Вестник Томского государственного университета. Филология 2019 № 57 С. 103–123
The article discusses the influence of the native language grammatical structure with the missing grammatical category of gender (Turkic language) on the processing of nouns of a second language (Russian), contrasted by gender, by Russian-Turkic bilinguals. The authors test the hypothesis about the bias effect of the first language (Tatar language) on the manifestation of ...
Added: October 29, 2021
Кластеризация данных, извлечение ключевых слов и лексическое разнообразие в текстах эссе учебного корпуса
Scherbakova A., В кн.: Межкультурное пространство: лингвистический и дидактический аспекты. Материалы секций "Межкультурная лингвистика", "Межкультурная транслатология" и студенческого научного форума. Пленарное заседание и секция «Межкультурная дидактика».Ч. 2.: Издательство ПетрГУ, 2021.
The paper focuses on the task of clustering essays produced by ESL (English as a Second Language) learners. The data was taken from a learner corpus REALEC. The division of texts by certain characteristics can be useful to speed up the analysis of a single corpus or access to the necessary sections of a large ...
Added: September 30, 2021
Автоматическое обнаружение и исправление деривационных ошибок в письменной речи на русском как иностранном
Vyrenkova A. S., Смирнов И. Ю., Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация 2021 Т. 19 № 3 С. 57–68
Learner corpora serve as one of the most valuable sources of statistical data on learners' errors. For instance, data from foreign-language learners’ corpora can be used for the Second Language Acquisition research. However, corpora representativity strongly depends on the quality of its error markup, which is most frequently carried out manually and thus presents a ...
Added: September 24, 2021
Межъязыковая интерференция при выборе видо-временных форм английских глаголов в эссе русскоязычных студентов: корпусное исследование
Vinogradova O. I., Viklova A., В кн.: Межкультурное пространство: лингвистический и дидактический аспектыЧ. 2: Материалы секций «Межкультурная лингвистика», «Межкультурная транслатология» и студенческого научного форума.: Петрозаводск: Издательство ПетрГУ, 2021. С. 17–27.
Added: July 7, 2021
Comparative Study Of Data Clustering Algorithms And Analysis Of The Keywords Extraction Efficiency: Learner Corpus Case
Scherbakova A., / NRU HSE. Series WP BRP "Linguistics". 2020.
Added: December 2, 2020
The Fourth Saint Petersburg Winter Workshop on Experimental Studies of Speech and Language (Night Whites 2018)
[б.и.], 2018.
Added: October 21, 2020
The Bimodal Corpus of Russian-Turkic Bilinguals’ Speech
Dybo A., Rezanova Z. I., Temnikova I. G. et al., Компьютерная лингвистика и интеллектуальные технологии 2019 No. 18(25) P. 200–210
T he paper presents Russian-Turkic Bilingual Corpus (RuTuBiC) design, its basic identifying features: the aim of producing a corpus, the types of texts it contains, metatextual markup and error annotation principles, technological (IT, digital) concepts. The current state and development trends of the corpus are discussed. The corpus started as an integral part of a ...
Added: December 1, 2019
What’s in a comma: Corpus study of punctuation errors and L1 interference
Pospelova K., Viklova A., Vinogradova O. I., , in: Learner Corpus Conference. LCR 2019. Book of Abstracts.: [б.и.], 2019. P. 0–20.
TBC ...
Added: November 10, 2019
  • About
  • About
  • Key Figures & Facts
  • Sustainability at HSE University
  • Faculties & Departments
  • International Partnerships
  • Faculty & Staff
  • HSE Buildings
  • HSE University for Persons with Disabilities
  • Public Enquiries
  • Studies
  • Admissions
  • Programme Catalogue
  • Undergraduate
  • Graduate
  • Exchange Programmes
  • Summer University
  • Summer Schools
  • Semester in Moscow
  • Business Internship
  • Research
  • International Laboratories
  • Research Centres
  • Research Projects
  • Monitoring Studies
  • Conferences & Seminars
  • Academic Jobs
  • Yasin (April) International Academic Conference on Economic and Social Development
  • Media & Resources
  • Publications by staff
  • HSE Journals
  • Publishing House
  • iq.hse.ru: commentary by HSE experts
  • Library
  • Economic & Social Data Archive
  • Video
  • HSE Repository of Socio-Economic Information
  • HSE1993–2026
  • Contacts
  • Copyright
  • Privacy Policy
  • Site Map
Edit