Approaches to automated English essay evaluation in Russian students’ learner corpus
REALEC (Vinogradova, 2016) is the first open-access collection of English texts (mainly essays) written by university students who are native speakers of Russian and are learning English. Over the last two years, the project team working with the corpus has been developing computational tools to make REALEC efficient to use for both students and their English instructors in preparation for the university EFL examination. This paper considers four tools designed to enhance corpus-mediated work in the classroom:
• easy access to the statistics of student errors in one text, in all texts written by the same author, or in all texts in the current folder, which provides on-the-spot feedback on the quality of a text uploaded to the corpus;
• automated evaluation of lexical proficiency, which includes commonly used features such as word length; sentence length; distribution of words across the Common European Framework scale levels (A1-C2); use of academic vocabulary, measured against one of two lists (the Coxhead Academic Word List or the academic vocabulary list from the Corpus of Contemporary American English); number of repetitions; use of linking words; and use of collocations (as attested by comparison with the Pearson academic collocation list);
• automated test-maker, which extracts sentences from the corpus and turns them into questions for placement and progress testing purposes;
• automated evaluation of the syntactic complexity of the text, which takes into account features such as mean sentence depth and the average number of relative and adverbial clauses.
The opportunity to get automated evaluation of the variety of syntactic means used in a student text is an important feature for both instructors and learners.
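As an illustration of the surface features named in the lexical proficiency tool above (word length, sentence length, repetitions, share of academic vocabulary), the following is a minimal sketch and not the REALEC implementation. The function name `lexical_features`, the naive regular-expression tokenisation, and the tiny `ACADEMIC_WORDS` set are all assumptions made for the example; a real tool would load a full word list and use a proper tokeniser.

```python
import re
from collections import Counter

# Hypothetical stand-in for an academic word list (NOT the actual Coxhead list).
ACADEMIC_WORDS = {"analyse", "approach", "data", "method", "research", "significant"}


def lexical_features(text: str) -> dict:
    """Compute a few surface lexical features of an essay (illustrative only)."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Naive tokenisation: lowercase alphabetic words, optional apostrophe part.
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not words or not sentences:
        raise ValueError("text contains no words")

    counts = Counter(words)
    return {
        # Average number of characters per word.
        "mean_word_length": sum(len(w) for w in words) / len(words),
        # Average number of words per sentence.
        "mean_sentence_length": len(words) / len(sentences),
        # Number of tokens that repeat an earlier word form.
        "repeated_tokens": sum(c - 1 for c in counts.values() if c > 1),
        # Fraction of tokens found in the (assumed) academic list.
        "academic_share": sum(1 for w in words if w in ACADEMIC_WORDS) / len(words),
    }


if __name__ == "__main__":
    sample = "The research is new. The data is significant."
    for name, value in lexical_features(sample).items():
        print(f"{name}: {value}")
```

Features that depend on external resources, such as CEFR level distribution or collocation matching, are left out of this sketch because they require the corresponding reference lists.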