

Approaches to automated English essay evaluation in Russian students’ learner corpus

P. 200–202.
Lyashevskaya O., Olga Vinogradova

REALEC (Vinogradova, 2016) is the first open-access collection of English texts (mainly essays) written by university students who are native speakers of Russian and are learning English. Over the last two years, the project team working with the corpus has been developing computational tools to make REALEC efficient to use for both students and their English instructors in preparation for the university EFL examination. This paper considers four tools designed to enhance corpus-mediated work in the classroom:

• easy access to the statistics of student errors in one text, in all texts written by the same author, or in all texts in the current folder, which provides on-the-spot feedback on the quality of a text uploaded to the corpus;

• automated evaluation of lexical proficiency, which includes commonly used features such as word length; sentence length; distribution of words across the Common European Framework scale levels (A1–C2); use of academic vocabulary as compared with one of two lists (the Coxhead Academic Word List or the one in the Corpus of Contemporary American English); number of repetitions; use of linking words; and use of collocations (as attested by comparison with the Pearson academic collocation list);

• an automated test-maker, which extracts sentences from the corpus and turns them into questions for placement and progress testing purposes;

• automated evaluation of the syntactic complexity of a text, which takes into account features such as mean sentence depth and the average number of relative and adverbial clauses.

The opportunity to get automated evaluation of the variety of syntactic means used in a student text is an important feature for both instructors and learners. 
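A few of the surface features named in the list above (word length, sentence length, repetitions, linking words) can be illustrated with a minimal Python sketch. This is not the REALEC implementation: the function name `lexical_features`, the tiny `LINKING_WORDS` set, the regex-based tokenisation and the way repetitions are counted are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of linking words (assumption:
# the abstract does not say which list the REALEC tools use).
LINKING_WORDS = {"however", "moreover", "therefore", "furthermore", "nevertheless"}


def lexical_features(text: str) -> dict:
    """Compute a few simple surface features of an essay."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Naive word tokenisation: runs of ASCII letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {
        "mean_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        # Every occurrence of a word beyond its first counts as a repetition.
        "repeated_word_tokens": sum(c - 1 for c in counts.values() if c > 1),
        "linking_words": sum(counts[w] for w in LINKING_WORDS),
    }


features = lexical_features("However, the plan failed. The plan was weak.")
```

A real tool would likely need a proper tokeniser, lemmatisation, and the external word lists mentioned in the abstract (CEFR levels, academic vocabulary, collocations), none of which are modelled here.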

Language: English
Keywords: corpus research, corpus annotation, learner corpora, computerized adaptive testing
Publication based on the results of:
Learner corpus REALEC: Lexicological observations (2016)

In book

4th Learner Corpus Conference. LCR 2017. Book of Abstracts
Bozen: [s.n.], 2017.
Similar publications
Distractor Generation for Lexical Questions Using Learner Corpus Data
Nikita Login, Jazykovedny Casopis 2023 Vol. 74 No. 1 P. 345–356
Learner corpora with error annotation can serve as a source of data for automated question generation (QG) for language testing. In the case of multiple-choice gap-fill lexical questions, this process involves two steps. The first step is to extract sentences with lexical corrections from the learner corpus. The second step, which is the focus of ...
Added: September 16, 2024
L1 Influence on the Use of the English Present Perfect: A Corpus Analysis of Russian and Spanish Learners’ Essays
Perez-Guerra J., Smirnova E. A., Journal of Language and Education 2024 Vol. 10 No. 1 P. 101–114
Mastering verbal tenses, especially those expressing aspect, in a second language presents a challenge as learners frequently link the semantic nuances of verbal forms in their second language (L2) to the characteristics of the verbal systems in their native languages (L1). This study explores the impact of L1 on the usage of the English Present ...
Added: March 3, 2024
Review of Practices of Collecting and Annotating Texts in the Learner Corpus REALEC
Vinogradova O. I., Lyashevskaya O., in: Text, Speech, and Dialogue. 25th International Conference, TSD 2022, Brno, Czech Republic, September 6–9, 2022, Proceedings. Lecture Notes in Computer Science (LNAI), vol. 13502. Cham: Springer Publishing Company, 2022. P. 77–88.
REALEC, a learner corpus released in open access, had received 6,054 essays written in English by HSE undergraduate students in their English university-level examination by the year 2020. This paper reports on the data collection and manual annotation approaches for the texts of 2014–2019 and discusses the computer tools available for working with the corpus. ...
Added: October 5, 2022
Word-formation complexity: a learner corpus-based study
Lyashevskaya O., Pyzhak J.V., Vinogradova O. I., Russian Journal of Linguistics 2022 Vol. 26 No. 2 P. 471–492
This article explores the word-formation dimension of learner text complexity which indicates how skilful the non-native speakers are in using more and less complex - and varied - derivational constructions. In order to analyse the association between complexity and writing accuracy in word formation as well as interactive effects of task type, text register, and ...
Added: October 5, 2022
Pragmatic Markers in the Corpus “One Day of Speech”: Approaches to the Annotation
Zaides K., Popova T., Bogdanova-Beglarian Natalia, in: Proceedings of Computational Models in Language and Speech Workshop (CMLS 2018) co-located with the 15th TEL International Conference on Computational and Cognitive Linguistics (TEL-2018). Vol. 2303: Computational Models in Language and Speech 2018. Kazan: CEUR Workshop Proceedings, 2018. P. 128–143.
Added: February 3, 2022
On unifying the annotation of the corpus «Сбалансированная аннотированная текстотека» (Balanced Annotated Text Library)
Zaides K., in: Proceedings of the International Conference “Corpus Linguistics 2019”. St Petersburg State University Press, 2019. P. 332–339.
The paper is devoted to the process and results of unifying the annotation of the corpus «Сбалансированная аннотированная текстотека» (Balanced Annotated Text Library). The corpus consists of several separate blocks representing the spoken language of members of different social and psychological groups. For further linguistic research, and for comparison with data obtained from other corpora, the annotation system of the corpus had to be unified. At the current stage, the work involved replacing the main transcription symbols that mark special phenomena characteristic of ...
Added: February 3, 2022
On building a set of relations for a corpus with discourse annotation
Sokolova E. G., Toldova S., Computational Linguistics and Computational Ontologies 2020 No. 4 P. 44–53
The work discusses the problem of discourse annotation and the consequences of simplifying the set of relations for the sake of higher inter-annotator agreement. One of the theoretical approaches to discourse structure representation is the Rhetorical Structure Theory by William Mann and Sandra Thompson [1]. There is a set of rhetorical relations between discourse units that ...
Added: November 17, 2021
Discourse features of blogs in subcorpus of Russian Ru-RSTreebank
Toldova S., Davydova T., Kobozeva M. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (Moscow, June 17–20, 2020). Issue 19(26): supplementary volume. 2020. P. 747–761.
The paper presents a corpus study of discourse features in a corpus of blogs. It is based on the data of Ru-RSTreebank, annotated within the framework of Rhetorical Structure Theory [Mann, Thompson 1988]. Ru-RSTreebank represents the genres of news and popular science, scientific papers, and blog texts. The blog subcorpus contains such topics as ...
Added: November 17, 2021
Using an Error-Annotated Learner Corpus (REALEC) in DDL Lessons
M. A. Klimova, V. K. Smilga, D. A. Overnikova, in: Proceedings of the International Conference “Corpus Linguistics 2021”. Skifia-print, 2021. P. 112–121.
Added: October 31, 2021
Response time in computerized adaptive testing
Federiakin D., in: Informatization of Education and E-Learning Methodology: Digital Technologies in Education. Vol. 2. Krasnoyarsk: Siberian Federal University Library and Publishing Complex, 2020. Ch. 45. P. 249–255.
The paper describes the ways to develop a computerized adaptive test using item response times as collateral information. The paper shows that introducing item response times in the measurement model has the same effect on the reliability of computerized adaptive tests as on the reliability of linear tests. Nonetheless, the presence of missing responses may ...
Added: October 28, 2020
Hedges in Russian EAP writing: A corpus-based study of research papers in management
Smirnova E. A., Strinyuk S. A., Journal of English as a Lingua Franca 2020 Vol. 9 No. 1 P. 81–101
The fact that English has become a lingua franca of academic communication has led to increased attention to teaching English for academic purposes (EAP) in academia. Academic discourse markers, such as hedges, have been an important topic in academic writing research whose prime aim is helping non-Anglophone researchers to present their research findings in ...
Added: October 14, 2020
De profundis: problems of deep annotation of the multimedia Russian corpus and ways of solving them
Pereverzeva S. I., Ermolaeva N. A., Zueva A. et al., Proceedings of the V.V. Vinogradov Russian Language Institute 2019 No. 21 P. 319–325
The paper focuses on the manual gesture annotation in the Multimodal Russian Corpus (MURCO), which was started up by E.A. Grishina and is continued by the authors of this paper. The important idea of the annotation process is the attempt to provide “the uniformity and commonality of the markup” [Grishina 2010] to the maximum degree ...
Added: April 27, 2020
Contrast and comparison relations in RST framework
Toldova S., Davydova T., Kobozeva M. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2019). Issue 18. M.: Russian State University for the Humanities, 2019. P. 714–727.
The paper is devoted to a corpus study of the Contrast relation between discourse units in Russian. It is based on the data of the Ru-RSTreebank annotated within the framework of the Rhetorical Structure theory [Mann, Thompson 1988]. The research question is what cue phrases and lexical and grammatical patterns are used to express the ...
Added: April 22, 2020
Punctuation in L2 English: Computational Methods Applied in the Study of L1 Interference
Vinogradova O. I., Viklova A., Smilga V., in: Emerging Writing Research from the Russian Federation. WAC Clearinghouse, University Press, Colorado, 2021. Ch. 9. P. 211–233.
Added: February 4, 2020
POS tagger evaluation for the automated text analysis and identification of learner error
Vinogradova O. I., Buzanov A., Generalova S. A. et al., in: The Space of Scholarly Interests: Foreign Languages and Intercultural Communication: Current Development Vectors and Prospects. Issue 3. Buki Vedi, 2019. Ch. 6. P. 44–49.
Working with learner corpora requires elaborate NLP techniques such as POS annotation. In this article, a team of computational linguists presents their experience of choosing a POS tagger for precise and effortless annotation of .txt files with Python 3. The Russian Error-Annotated Learner English Corpus (REALEC) is the underlying corpus to whose text features the POS tagger has to respond. ...
Added: December 28, 2019
Problems of annotating a corpus of Russian texts in terms of Rhetorical Structure Theory: lessons from building Ru-RSTreebank
Toldova S., Kobozeva M. V., Tugutova A. A. et al., in: Proceedings of the International Conference “Corpus Linguistics 2019”. St Petersburg: St Petersburg University Press, 2019. P. 120–126.
The work is devoted to different aspects of the Russian discourse treebank annotation. We discuss different issues of the procedure and different difficulties we came across in the process of adaptation of the RST theory to the Russian data of News texts. ...
Added: November 25, 2019
Special properties of the rhetorical relations “Contrast” and “Comparison” based on annotation in the Ru-RSTreebank corpus
Sokolova E. G., Toldova S., in: Proceedings of the International Conference “Corpus Linguistics 2019”. St Petersburg: St Petersburg University Press, 2019. P. 127–133.
The work is devoted to distinguishing the Contrast and Comparison relations within the framework of the Rhetorical Structure Theory of Mann and Thompson. The analysis of annotated data in terms of logical or pragmatic constraints is suggested. This analysis makes it possible to suggest some operational criteria for the relations under discussion. These criteria together with ...
Added: November 25, 2019
Automated assessment of learner text complexity
Lyashevskaya O., Irina Panteleeva, Olga Vinogradova, Assessing Writing 2021 No. 49 Article 100529
EFL methodology has always recognized the importance of giving learners of foreign languages regular and quick feedback on their speech production, both written and oral, and over the past two decades various tools have appeared for the provision of automated instant feedback. The presented paper offers an application that focuses on measuring text complexity, ...
Added: October 20, 2019