Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue” (2015)
Distributed vector representations of natural language vocabulary receive considerable attention in contemporary computational linguistics. This paper summarizes our experience of applying neural network language models to the task of calculating semantic similarity for Russian. The experiments were performed as part of the Russian Semantic Similarity Evaluation track, where our models ranked between 2nd and 5th, depending on the task. We introduce the tools and corpora used, comment on the nature of the evaluation track and describe the results achieved. We found that the Continuous Skip-gram and Continuous Bag-of-words models, previously applied successfully to English material, can be used for semantic modeling of Russian as well. Moreover, we show that texts from the Russian National Corpus (RNC) provide excellent training material for such models, outperforming other, much larger corpora. This is especially true for semantic relatedness tasks (although stacking models trained on larger corpora on top of RNC models improves performance even further). High-quality semantic vectors learned in this way can be used in a variety of linguistic tasks and open an exciting field for further study.
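As an illustration of how a similarity score can be obtained from word vectors, the sketch below computes cosine similarity, a measure commonly used with such models. The abstract does not state which measure the authors used, so this is an assumption. The three-dimensional vectors and the names `toy_vectors` and `similarity` are invented for the example and do not come from any trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical toy vectors, invented for illustration only.
# Real Skip-gram or CBOW vectors typically have hundreds of dimensions.
toy_vectors = {
    "кошка": [0.9, 0.1, 0.3],      # "cat"
    "собака": [0.8, 0.2, 0.4],     # "dog"
    "экономика": [0.1, 0.9, 0.2],  # "economy"
}

def similarity(word_a, word_b, vectors=toy_vectors):
    """Similarity of two words under the given vector table."""
    return cosine_similarity(vectors[word_a], vectors[word_b])
```

With these toy values, `similarity("кошка", "собака")` is higher than `similarity("кошка", "экономика")`, which is the kind of ordering a similarity evaluation checks against human judgments.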
Current trends in education, namely blended learning and computer-assisted language learning, underlie the growing interest in the task of automatically generating language exercises. Such automatic systems are especially in demand given the variability in language learning. Despite the abundance of resources for language learning, there is often a lack of specific exercises targeting a particular group of learners or a particular ESP course. This paper gives an overview of a computer system called Exercise Maker, which is aimed at flexible and versatile generation of language exercises. The system supports seven exercise types, which can be generated from arbitrary passages written in English. The ability to tailor educational material to learners’ interests is known to boost their motivation (Heilman et al., 2010). An important feature of the system is the automatic ranking of source passages according to their complexity/readability. As shown by expert evaluation, the automatically generated exercises are of high quality: gap precision is about 97-98%, while the overall exercise acceptance rate varies from 90% to 97.5%. Exercise Maker is freely available for educational and research purposes.
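To make the notion of a gap in a generated exercise concrete, here is a minimal sketch of a gap-fill generator. It is not Exercise Maker's actual method, which the abstract does not describe. The function name `make_gap_fill`, the gap format, and the idea of passing an explicit list of target words are assumptions made for this example.

```python
import re

def make_gap_fill(passage, targets):
    """Replace whole-word occurrences of target words with numbered gaps.

    Returns a pair (exercise_text, answer_key), where answer_key lists
    the removed words in the order of their gaps.
    """
    target_set = {t.lower() for t in targets}
    answers = []

    def replace(match):
        word = match.group(0)
        if word.lower() in target_set:
            answers.append(word)
            return "({}) ____".format(len(answers))
        return word  # leave non-target words untouched

    # Match alphabetic tokens only, so digits and punctuation are kept as-is.
    exercise = re.sub(r"[A-Za-z']+", replace, passage)
    return exercise, answers

# Hypothetical usage: a preposition exercise built from one sentence.
text, key = make_gap_fill("She has lived in Paris since 2010.", ["in", "since"])
```

Here `text` becomes `She has lived (1) ____ Paris (2) ____ 2010.` and `key` is `["in", "since"]`. A real system would select the target words automatically, for example by part of speech, rather than take them as input.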