Non-standard Russian Speech: Corpus Technologies in Research and Teaching Methodology

Ya. E. Akhapkina; N. N. Builova

P. 48–51.

Akhapkina Y., Builova N.

Language: Russian

Keywords: corpus of imperfect translations; corpus annotation; corpus of heritage texts; error corpus; corpus of learner texts; corpus of 19th-century language

In book

Problems of Teaching the Course "Russian Language and Speech Culture in Higher Education Institutions"

Moscow: Nauchnyi Konsultant LLC, 2016.

The pseudo-synonymous Russian constructions у X-а Y and у X-а есть Y in the context of learning Russian

Apresyan V., in: XVII April International Academic Conference on Economic and Social Development: in 4 books. Book 4. Moscow: HSE Publishing House, 2017. P. 369–379.

The use of possessive constructions with a zero predicate and with the word form есть is governed by a number of semantic, pragmatic, and communicative rules. The construction у X-а есть Y is marked semantically, communicatively, and in its co-occurrence; it presupposes: a contrast between presence and absence (У меня есть друзья), a contrast between the presence of objects of one type and objects of another type (У него есть хорошие студенты), rhematization of the verb (У меня ЕСТЬ мнение); related ...

Added: November 30, 2016

Russian Learner Parallel Corpus as a Tool for Translation Studies

Kutuzov A. B., Kunilovskaya M. A., Oschepkov A. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Bekasovo, May 30–June 3, 2012). In 2 volumes. Vol. 1: Main conference programme. Issue 11. Moscow: Russian State University for the Humanities, 2012. P. 362–369.

The paper presents a project aimed at the development of a Russian Learner Parallel Corpus, discusses the existing analogues, describes the current status and the tasks in which it could be used. The existing parallel corpora contain (comparatively) “correct” translations; whereas the aim of the present project is to create a sufficiently large corpus of ...

Added: February 13, 2013

Russian possessive constructions with a zero and an overt verb: rules and errors

Apresyan V., Russkii yazyk v nauchnom osveshchenii, 2017, No. 33, P. 86–116.

The article deals with the pseudo-synonymous possessive constructions у Х-а есть Y and У Х-а Y. It examines the rules governing their use and how difficult each is for foreign students to acquire. The study was conducted on the RULEC error corpus. The following results were obtained: advanced students made few errors overall in the use of possessive constructions (no more than 10 percent), from which ...

Added: November 30, 2016

How inter-annotator agreement helps to improve error annotation schemes in learner corpora

Fenogenova A., Kuzmenko E., Olga Vinogradova, in: TaLC 12 - Teaching and Language Corpora Conference. [s.n.], 2016. P. 30–34.

The scope and the level of change suggested by an annotator cannot be formally defined, and besides, it is not often that two persons, whether native speakers or fluent speakers of a foreign language, will not differ in their intuitive perception of what is acceptable in the language. However, if annotators stick to the ...

Added: December 11, 2016

Coreference in Russian Oral Movie Retellings (the Experience of Coreference Relations Annotation in the “Russian CliPS” Corpus)

Toldova S. Yu., Bergelson M. B., Khudyakova M. V., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, July 1–4, 2016). Issue 15. Moscow: RGGU Publishing House, 2016. P. 769–781.

The work deals with adapting the annotation system of the Russian coreference corpus RuCor (used for written Russian) to the corpus of Russian oral narratives from the Russian Clinical Pear Stories Corpus (Russian CliPS) (Khudyakova et al., 2016). Russian CliPS is a corpus of retellings of the “Pear Stories” movie (Chafe, 1980) in Russian by clinical populations as compared to ...

Added: June 6, 2016

Pre-experiments on Annotation of Russian Coreference Corpus

Toldova S., Azerkovich I., Grishina Yu. et al. / NRU HSE. Series WP BRP "Linguistics". 2015.

Building benchmark corpora in the domain of coreference and anaphora resolution is an important task for developing and evaluating NLP systems and models. Our study is aimed at assessing the feasibility of enhancing corpora with information about coreference relations. The annotation procedure includes identification of text segments that are subject to annotation (markables), marking their ...

Added: December 15, 2015

Discourse features of blogs in subcorpus of Russian Ru-RSTreebank

Toldova S., Davydova T., Kobozeva M. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26): supplementary volume. 2020. P. 747–761.

The paper presents a corpus study of discourse features in a corpus of blogs. It is based on the data of Ru-RSTreebank, annotated within the framework of Rhetorical Structure Theory [Mann, Thompson 1988]. Ru-RSTreebank represents the genres of news, popular science, scientific papers, and blog texts. The blog subcorpus covers such topics as ...

Added: November 17, 2021

Degrees of comparison in the light of the Russian grammar of errors

Rakhilina E. V., Trudy Instituta russkogo yazyka im. V. V. Vinogradova, 2015, No. 6, P. 310–333.

Drawing on corpus material, the article examines common errors in the Russian comparison construction from theoretical and typological perspectives. It uses data from the Academic Writing corpus and the Learner Corpus, which contains texts by learners of Russian as a foreign language. The "weak spots" and "growth points" of this construction are identified. It is shown that the observed variability and changes in usage are on the whole expected: they simplify ...

Added: March 2, 2016

A research portal for analysing and assessing the style of scientific publications

Shuchalova Y., Lanin V., Informatsionnye tekhnologii, 2018, Vol. 24, No. 8, P. 515–523.

The paper describes the design stage of a portal for corpus-based studies of English. Requirements for the solution are formulated, and linguistic approaches to the tasks at hand are presented. The system modelling process is described, and implementation details specific to the subject domain are considered. A service-oriented architecture is proposed for integrating heterogeneous components. ...

Added: December 14, 2017

Russian CliPS: a Corpus of Narratives by Brain-Damaged Individuals

Khudyakova M., Bergelson M., Akinina Y. et al., in: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), Portoroz, Slovenia: ELRA, 2016. [s.n.], 2016. P. 22–26.

In this paper we present a multimedia corpus of Pear film retellings by people with aphasia (PWA), people with right hemisphere damage (RHD), and healthy speakers of Russian. Discourse abilities of brain-damaged individuals are still under discussion, and the Russian CliPS (Clinical Pear Stories) corpus was created for the thorough analysis of micro- and macro-linguistic levels of narratives by PWA ...

Added: October 13, 2016

Special properties of the rhetorical relations "contrast" and "comparison" based on annotation in the Ru-RSTreebank corpus

Sokolova E. G., Toldova S., in: Proceedings of the International Conference "Corpus Linguistics - 2019". St. Petersburg: St. Petersburg University Press, 2019. P. 127–133.

The work is devoted to distinguishing the Contrast and Comparison relations within the framework of Mann and Thompson's Rhetorical Structure Theory. An analysis of annotated data in terms of logical or pragmatic constraints is proposed. This analysis makes it possible to suggest some operational criteria for the relations under discussion. These criteria together with ...

Added: November 25, 2019

Problems of annotating a corpus of Russian texts in terms of Rhetorical Structure Theory: lessons from building Ru-RSTreebank

Toldova S., Kobozeva M. V., Tugutova A. A. et al., in: Proceedings of the International Conference "Corpus Linguistics - 2019". St. Petersburg: St. Petersburg University Press, 2019. P. 120–126.

The work is devoted to different aspects of annotating the Russian discourse treebank. We discuss various issues with the procedure and the difficulties we came across in adapting RST to Russian news texts. ...

Added: November 25, 2019

Daba: a model and tools for Manding corpora

Kirill Maslinsky, in: TALN-RECITAL 2014 Workshop TALAf 2014 : Traitement Automatique des Langues Africaines (TALAf 2014: African Language Processing). Marseille: Association pour le Traitement Automatique des Langues, 2014. P. 114–122.

This article provides a brief overview of the Daba software package, created in the course of building corpora for Manding languages. Key software features are motivated by the tasks and problems characteristic of many African languages. The corpus-building model proposed here was initially developed for the Bambara Reference Corpus, which is available online and is freely accessible. ...

Added: March 26, 2015