RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

Shavrina T., Fenogenova A., Emelyanov A., Shevelev D., Artemova E., Malykh V., Mikhailov V., Tikhonova M., Chertok A., Evlampiev A.

P. 4717–4726.

Language: English

Keywords: natural language processing; Russian language; benchmarks; BERT Language Model; Artificial General Intelligence (AGI)

In book

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Association for Computational Linguistics, 2020.

DaNetQA: a yes/no Question Answering Dataset for the Russian Language

Glushkova T., Machnev A., Fenogenova A. et al., in: Analysis of Images, Social Networks and Texts: 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020, Revised Selected Papers. Vol. 12602. Springer, 2021. P. 57–68.

Added: November 22, 2020

Applying statistical tagging to Russian poetry

Starchenko A., Kazakevich L., Lyashevskaya O. / NRU HSE. Series WP BRP "Linguistics". 2018. No. 76.

Poetic texts pose a challenge to full morphological tagging and lemmatization, since the authors seek to extend the vocabulary, employ morphologically and semantically deficient forms, go beyond standard syntactic templates, and use non-projective constructions and non-standard word order, among other techniques of the creative language game. In this paper we evaluate a number of probabilistic ...

Added: December 12, 2018

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2019)

Moscow: Russian State University for the Humanities, 2019.

The book includes 64 papers submitted to Dialogue 2019, the international conference on computational linguistics and intellectual technologies, and presents a broad spectrum of theoretical and applied research on natural language description, language simulation, and the creation of applied computer technologies. ...

Added: October 16, 2019

Universal Dependencies for Russian: A New Syntactic Dependencies Tagset

Lyashevskaya O., Droganova K., Zeman D. et al. / NRU HSE. Series WP BRP "Linguistics". 2016. No. 44.

This paper presents the Universal Dependencies tagset (UD v1) as a new annotation scheme for Russian treebanks. The universal list of dependency relations was adopted and extended to comply with certain language-specific syntactic constructions. The tagset was validated by converting two Russian treebanks into the UD format, UD-Russian-SynTagRus and UD-Russian-Google. ...

Added: December 14, 2016

Automatic generation of reviews of scientific papers

Nikiforovskaya A., Kapralov N., Vlasova A. et al., in: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA 2020). Miami: IEEE, 2020. P. 314–319.

With an ever-increasing number of scientific papers published each year, it becomes more difficult for researchers to explore a field that they are not closely familiar with already. This greatly inhibits the potential for cross-disciplinary research. A traditional introduction into an area may come in the form of a review paper. However, not all areas ...

Added: December 28, 2020

The democratization of Russian

Mustajoki A., in: The Soft Power of the Russian Language: Pluricentricity, Politics and Policies (Studies in Contemporary Russia). Routledge, 2020. P. 21–34.

People are not born with equal opportunities. This also concerns their linguistic capabilities. At the same time, human language is, by its very nature, democratic in the sense that all children learn their mother tongue in a similar way – by imitating the speech of their parents and building on that basis their own personal ...

Added: December 10, 2020

The Advantages of Human Evaluation of Sociomedical Question Answering Systems

Firsanova V. I., International Journal of Open Information Technologies 2021 Vol. 9 No. 12 P. 53–59

The paper presents a study on question answering systems evaluation. The purpose of the study is to determine if human evaluation is indeed necessary to qualitatively measure the performance of a sociomedical dialogue system. The study is based on the data from several natural language processing experiments conducted with a question answering dataset for inclusion of people with autism spectrum disorder and state-of-the-art ...

Added: September 25, 2023

Gapping parsing using pretrained embeddings, attention mechanism and NCRF

Emelyanov A., Artemova E., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2019). Issue 18. Moscow: Russian State University for the Humanities, 2019. P. 203–212.

The article is devoted to the problem of automatic gapping resolution for the Russian language. We use the BERT language model as embeddings, with a bidirectional recurrent network, attention, and NCRF on top. Unlike other models that use BERT, we apply BERT only as an embedder, without any fine-tuning. As a result, our implementation took ...

Added: October 29, 2020

On the deep semantics and functional-referential features of adjectival markers of precedence in French and Russian

Naberezhnova Z. G., Almanac of Modern Science and Education 2010 No. 12 P. 226–231

Added: November 23, 2012

Using the information theory of speech perception to analyze speech quality

Karpov N., in: Modern Problems of Informatization in the Analysis and Synthesis of Technological and Software-Telecommunication Systems: Collected Papers. Issue 17. Voronezh: Nauchnaya Kniga, 2012. P. 264–266.

Added: November 7, 2012

From quantitative to semantic analysis: Russian constructions with dative subjects in diachrony

Bonch-Osmolovskaya A. A., in: Quantitative approaches to the Russian language. Abingdon: Routledge, 2018. P. 158–174.

The paper presents a diachronic study of dative subject constructions with predicatives in Russian. A dataset from a corpus of the 19th–21st centuries is analysed with a clustering method, and the classes of predicates that exhibit similar behaviour are defined. A semantic interpretation is proposed for the observed distribution. ...

Added: July 14, 2017

Predicate agreement with the words r’ad, polovina, chast’, mnozestvo in modern Russian

Kuvshinskaya Y. M., Siberian Journal of Philology 2019 No. 2 P. 189–215

The work deals with strategies for predicate agreement with quantified noun groups headed by nouns. In Russian, as in other Slavic languages, predicate agreement with quantified noun phrases allows singular or plural forms of the predicate. For sentences with the quantifier nouns r’ad, polovina, chast’, mnozestvo, three agreement strategies are possible: the predicate agrees with ...

Added: September 8, 2019

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (Moscow, May 29 – June 1, 2019)

Moscow: Russian State University for the Humanities Publishing Center, 2019.

Added: October 16, 2019

TEXTS OF DIFFERENT EMOTIONAL CLASSES AND THEIR TOPIC MODELING

Kolmogorova A., Qiuhua S., Vestnik of Volgograd State University. Series 2: Linguistics 2025 Vol. 23 No. 5

The article studies how various emotional states are verbalized in Russian texts, in order to confirm or refute the hypothesis that texts of different emotional classes reflect the denotative situation differently, as shown in their thematic specifics and lexical content. The research material consisted of eight corpus texts ...

Added: November 29, 2024

SyntaxNet Errors from the Linguistic Point of View

Durandin O., Malafeev A., Zolotykh N., in: Analysis of Images, Social Networks and Texts. 6th International Conference, 2017, Revised Selected Papers. Vol. 10716. Cham: Springer, 2018. Ch. 4. P. 34–46.

The paper deals with Google’s universal parser SyntaxNet. The system was used to analyze the Universal Dependencies linguistic corpora. We conducted an error analysis of the output of the parser to reveal to what extent the error types are connected with or preconditioned by the language types. In particular, we carried out several experiments, clustering ...

Added: December 1, 2017

The rise of a lingua franca: The case of Russian in Dagestan

Dobrushina N., Kultepina O., International Journal of Bilingualism 2021 Vol. 25 No. 1 P. 338–358

Aims and objectives: In Dagestan, Russian is the language of education, urban way of life, and upward social mobility, and the means of communication between speakers of different languages. This is a result of a quick and drastic change. At the end of the 19th century, Russian was spoken by less than 1% of the population. ...

Added: October 14, 2020

Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

Association for Computational Linguistics, 2019.

Added: September 15, 2020

Style transfer in NLP: a framework and multilingual analysis with Friends TV series

Tikhonova M., Elina Telesheva, Mirzoev S. et al., in: 2021 International Conference Engineering and Telecommunication (En&T). IEEE, 2022. P. 1–6.

Style transfer is an important and rapidly developing area of Natural Language Processing. These days, more and more methods and models are proposed that allow us to generate text in a predefined style. In this paper we propose a framework for style transfer on the “Friends” TV series. The trained models are able to mimic one of ...

Added: May 21, 2022

Corpus-based profiles of Russian nouns: from grammatical number to lexical semantics

Lyashevskaya O., / NRU HSE. Series WP BRP "Linguistics". 2015.

A grammatical profile, which indicates the relative frequency distribution of the inflected forms of a word in a corpus, is a tool for exploring lexical semantics. However, previous attempts to infer semantically relevant hierarchies of nouns from frequency biases within their grammatical forms seem to have failed. In this paper we explore the distinctive ...

Added: April 15, 2015

A method of semantic search for specialists with a defined set of competencies

Zakhlebin I. V., in: E-Business. Internet Project Management. Innovations: Proceedings of the Student Research and Practice Conference, Moscow, March 12–14, 2013. Moscow: NRU HSE, 2014. P. 88–91.

The report deals with the methodology of building a system that searches for specialists with a defined set of competencies. The proposed search method is based on the analysis of natural language texts. ...

Added: July 11, 2015

Lost in Conversation: A Conversational Agent Based on the Transformer and Transfer Learning

Golovanov S., Tselousov A., Rauf Kurbanov et al., in: The NeurIPS '18 Competition: From Machine Learning to Intelligent Conversations. Springer, 2020. P. 295–315.

Added: February 20, 2021

Faces of Bilingualism

St. Petersburg: Zlatoust, 2016.

This book is a collection of papers written by Russian and foreign linguists to highlight the different aspects of bilingualism. Much attention is paid to the early simultaneous and successive bilingualism in children; however, adults speaking several languages in natural settings as well as in classroom are also considered. Some chapters are concentrated on language attrition — an ...

Added: October 2, 2016

"Poverkh ochkov" ("Over the glasses"): spatial interpretations and semantics of a prepositional construction

Lyashevskaya O., Acta Linguistica Petropolitana. Transactions of the Institute for Linguistic Studies 2014 Vol. X No. 2 P. 332–361

The preposition poverkh ("over") belongs to the non-primary prepositions, which have simpler semantics than the polysemous primary prepositions. We present the semantic structure of the preposition's uses as a radial category that links together various image schemas. The classes of uses are distinguished by the topological type of the figure and the landmark, as well as by the functional relations between them. The unusual feature of the category is that, ...

Added: October 7, 2014

The “Bankruptcy” Concept Within the Legal Linguistics Coordinates: Russian–English–French Approximations

Vlasenko S. V., Galimov A. R., Herald of Tver State University. Series: Philology 2012 Vol. 10 No. 2 P. 21–28

The article addresses the notion of bankruptcy as perceived by speakers of present-day Russian, English and French, both lawyers and participants in professional communication from other fields. The semantic structure of the term is identified from its lexicographic and regulatory definitions. ...

Added: October 4, 2012