Evaluation of frame-semantic role labeling in a case-marking language

Lyashevskaya O., Kashkin E.

Компьютерная лингвистика и интеллектуальные технологии. 2014. No. 20. P. 362–378.

The paper discusses evaluation techniques for semantic role labeling in Russian. It has been shown that the quality of FrameNet-style semantic role labeling depends largely on the number of roles and may decrease if the inventory of roles in the training set differs from that in the output resource. Our study is a first step towards a ‘smart’ evaluation tool that would introduce linguistically relevant criteria into evaluation, place errors on a scale from minor to critical, and make evaluation easier when the grid of roles varies.

We run an experiment based on data from the Russian FrameBank, a FrameNet-oriented open-access database that includes a dictionary of Russian lexical constructions and a corpus of tagged examples. The semantic role is one of the parameters that define the predicate-argument patterns in FrameBank. The inventory of roles is modeled hierarchically and forms a graph. We explore the cases in which the role induced by the system and the gold-standard answer do not match. We analyze statistical criteria of the distribution of roles in the patterns, and the distance between the source and the target in the graph of roles, as a means of assessing the goodness of fit.
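The idea of grading a mismatch by the distance between the predicted role and the gold role in the role graph can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy hierarchy in `PARENT`, the function names, and the `1 / (1 + d)` scoring formula are hypothetical and are not taken from FrameBank or from the paper.

```python
from collections import deque

# Hypothetical toy hierarchy (child -> parent); NOT the actual FrameBank inventory.
PARENT = {
    "Agent": "Actor",
    "Effector": "Actor",
    "Patient": "Undergoer",
    "Theme": "Undergoer",
    "Actor": "Role",
    "Undergoer": "Role",
}


def build_graph(parent):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, par in parent.items():
        graph.setdefault(child, set()).add(par)
        graph.setdefault(par, set()).add(child)
    return graph


def role_distance(graph, predicted, gold):
    """Number of edges on the shortest path between two roles (BFS).

    Returns None if no path exists, e.g. for a role missing from the graph.
    """
    if predicted == gold:
        return 0
    seen = {predicted}
    queue = deque([(predicted, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == gold:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def graded_score(graph, predicted, gold):
    """Assumed scoring rule: 1.0 for an exact match, decaying with distance."""
    d = role_distance(graph, predicted, gold)
    return 0.0 if d is None else 1.0 / (1 + d)


GRAPH = build_graph(PARENT)
```

Under this toy hierarchy, confusing Agent with its sibling Effector counts as a smaller error than confusing Agent with Patient, which is one way a minor-to-critical scale could be derived from the graph.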

Research target: Philology and Linguistics

Language: English

Keywords: corpus linguistics; constructions; polysemy; evaluation of lexical resources; semantic roles; semantic role labeling; lexical resources

Inducing verb classes from frames in Russian: morpho-syntax and semantic roles

Lyashevskaya O., Kashkin E., Компьютерная лингвистика и интеллектуальные технологии 2015 Vol. 14 P. 427–440

The paper presents clustering experiments on Russian verbs based on the statistical data drawn from the Russian FrameBank (framebank.ru). While lexicology has essentially abandoned the idea of syntactic transformations as the primary basis for grouping verbs into semantic classes (Apresjan 1967, Levin 1993), the hypothesis of the same lexical and syntactic distributional profiles underlying lexical ...

Added: March 27, 2015

FrameBank: a database of Russian lexical constructions

Kashkin E., Communications in Computer and Information Science, Springer, Germany 2015 Vol. 542 P. 337–348

Added: September 30, 2015

Лексическое значение в фокусе корпуса текстов

Botchkarev A., Вестник Волгоградского государственного университета. Серия 2: Языкознание 2015 Vol. 28 No. 4 P. 144–151

The methods and procedures of task solving in the field of semantics are inevitably influenced by the paradigm change. In particular, using text corpora, it is possible to re-examine problems that are traditional for lexical semantics, such as polysemy, the unity of meaning as well as opportunities for interpretation of units provided by a context ...

Added: January 26, 2016

FrameBank: a database of Russian lexical constructions

Lyashevskaya O., Kashkin E., in: Analysis of Images, Social Networks and Texts. 4th International Conference, AIST 2015, Yekaterinburg, Russia, April 9–11, 2015, Revised Selected Papers. Vol. 542. Series: Communications in Computer and Information Science. Switzerland: Springer, 2015. Ch. 34. P. 337–348.

Russian FrameBank is a bank of annotated samples from the Russian National Corpus which documents the use of lexical constructions (e.g. argument constructions of verbs and nouns). FrameBank belongs to FrameNet-oriented resources, but unlike Berkeley FrameNet it focuses more on the morphosyntactic and semantic features of individual lexemes rather than the generalized frames, following the ...

Added: April 11, 2015

Автоматическое определение частей речи для русского языка с помощью обучения трансформаций.

Kitov V. V., Научные труды Вольного экономического общества России 2014 Vol. 186 P. 228–235

This paper describes the application of the well-known "transformation-based learning" algorithm for automatic rule generation to the task of part-of-speech tagging. The algorithm is applied to corpora of annotated Russian texts, and its accuracy as well as the most significant rules are shown. ...

Added: March 16, 2016

RUSSE2018: a Shared Task on Word Sense Induction for the Russian Language

Panchenko A., Lopukhina A., Ustalov D. et al., Компьютерная лингвистика и интеллектуальные технологии 2018 No. 17 P. 547–564

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic ...

Added: June 7, 2018

Прогностическая валидность глагольных форм длительного аспекта в корпусной лингвистике английского языка

Popkova E., Социосфера 2010 No. 4 P. 74–81

The article discusses the most recent trends in the development of the English progressive. A corpus-based approach to linguistic research is seen as an effective means of determining reliability of the data retrieved and helps track the major diachronic dynamic in the increasing frequency of the progressive aspect that has taken place since the beginning ...

Added: November 6, 2012

После, через, спустя во временны́х контекстах: из наблюдений над текстами казахско-русских билингвов

Rakhilina E. V., Казкенова А. К., Akhapkina Y., Вестник Томского государственного университета. Филология 2021 Vol. 73 P. 93–113

The paper examines non-standard uses of the prepositions после, через and спустя in temporal contexts by Kazakh-Russian bilinguals. It is shown that the deviations stem from grammatical differences between the speakers' native language and Russian. The analysis of the deviations reveals specific features of the prepositions: their ability to mark the completion of events and time spans, both single and recurring, as well as the ambiguity of через in combinations with the names of different time intervals. ...

Added: December 1, 2021

Корпусный анализ русского стиха

Moscow: Азбуковник, 2013.

This volume collects articles prepared using materials from the poetic corpus of the Russian National Corpus. Drawing on extensive material, the authors trace the history of individual words in the language of poetry, analyze various aspects of poetic grammar and semantics, and examine some formal parameters of Russian verse. The volume is intended for specialists in linguistic poetics and verse studies, as well as for those interested in modern ...

Added: September 28, 2013

Еще раз об исследовательском потенциале поэтического корпуса: метр, лексика, формула

Orekhov B., Труды института русского языка им. В.В. Виноградова 2015 No. 6 P. 449–463

The article continues the line of publications by other researchers that demonstrate the opportunities of the poetic subcorpus of the Russian National Corpus. The question is what issues related to the history of Russian poetry can be solved with the help of the corpus. In the first part of the article there is a pilot study ...

Added: March 16, 2016

Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)

Marseille: European Language Resources Association (ELRA), 2022.

The proceedings are organised on the basis of the 22 Tracks of the Conference on Language Resources and Evaluation (LREC) held in Marseille, France, from 20 to 25 June 2022. Major topics include corpora and annotation (including tools, systems, treebanks), information extraction and information retrieval (including NER, QA, text mining, document classification, text categorisation), applications involving LRs and evaluation (including ...

Added: February 22, 2023

Using TXM Platform for Research on Language Changes over Time: The Dynamics of Vocabulary and Punctuation in Russian Literary Texts

Lavrentiev A. M., Sherstinova T., Chepovskiy A. et al., Vestnik Tomskogo Gosudarstvennogo Universiteta, Filologiya 2021 Vol. 70 P. 69–89

The purpose of this paper is to test the methodological tools provided by TXM platform for research on dynamics of vocabulary and punctuation marks in diachronic corpora. TXM is a powerful text analysis software which provides both quantitative and qualitative features in a transparent open-source implementation. In this paper, we demonstrate how it can be ...

Added: June 24, 2021

Корпусные исследования особенностей речи нестандартных говорящих ("херитажный русский")

Rakhilina E. V., Марушкина А. С., Acta Linguistica Petropolitana. Труды института лингвистических исследований 2015 Vol. XI No. 1 P. 621–639

The paper presents an analysis of comparative, conditional and prepositional constructions in the speech of heritage speakers of Russian and learners of Russian as a second language on the material from the Russian Learner Corpus. ...

Added: July 25, 2015

Свойства дискурсивных формул на примере русских конструкций ты что и что ты

Bychkova P., Русский язык в научном освещении 2020 No. 2 (40) P. 88–111

The paper discusses semantic description of the so-called discourse formulae, idiomatic expressions used as speaker's reactions in a dialogue. They are considered in the framework of construction grammar, as a peripheral class of constructions with its specific properties. A case study of two synonymous Russian discourse formulae TY ČTO and ČTO TY provides an account ...

Added: September 23, 2020

Input a Word, Analyze the World

Newcastle upon Tyne: Cambridge Scholars Publishing, 2016.

Input a Word, Analyze the World represents current perspectives on Corpus Linguistics (CL) from a variety of linguistic subdisciplines. Corpus Linguistics has proven itself an excellent methodology for the study of language variation and change, and is well-suited for interdisciplinary collaboration, as shown by the studies in this volume. Its title is inspired by the ...

Added: October 15, 2016

К вопросу о согласовании времен в современном русском языке: Корпусное исследование дистрибутивных характеристик временных форм в сентенциальных актантах

Schnittke E., Вопросы языкознания 2020 No. 3 P. 26–51

The study examines tense variation in complement and subject clauses subordinate to and co-temporal with matrix past tense verbs in Russian. The semantics of the matrix verb is commonly named as one of the major factors that govern tense choice in complement and subject clauses: verbs of speech are said to exclusively license present tense ...

Added: October 23, 2019

Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories (TLT 16)

Association for Computational Linguistics, 2017.

The volume includes papers presented at the 16th International Workshop on Treebanks and Linguistic Theories (TLT), which brings together developers and users of linguistically annotated natural language corpora. As ‘treebanks’ we consider any pairing of natural language data (spoken or written) with annotations of linguistic structure at various levels of analysis, ranging from e.g. morpho-phonology ...

Added: December 11, 2018

Referential Choice: Predictability and Its Limits

Kibrik A. A., Khudyakova M., Dobrov G. B. et al., Frontiers in Psychology 2016 Vol. 7 No. 1429 P. 1–21

We report a study of referential choice in discourse production, understood as the choice between various types of referential devices, such as pronouns and full noun phrases. Our goal is to predict referential choice, and to explore to what extent such prediction is possible. Our approach to referential choice includes a cognitively informed theoretical component, ...

Added: September 28, 2016

Representation of Different Types of Adjectival Polysemy in the Mental Lexicon

Apresyan V., Lopukhina A., Zarifyan M., Frontiers in Psychology 2021 Vol. 12 Article 742064

We studied mental representations of literal, metonymically different, and metaphorical senses in Russian adjectives. Previous studies suggested that in polysemous words, metonymic senses, being more sense-related, were stored together with literal senses, whereas more distant metaphorical senses had separate representations. We hypothesized that metonymy may be heterogeneous with respect to its mental storage. “Whole-part” metonymy ...

Added: October 29, 2021

Полисемия в списках самодийской базисной лексики и языковые контакты

Fedotova I., Урало-алтайские исследования 2020 No. 2 (37) P. 77–113

This paper investigates cases of semantic shifts and proto-language polysemy in the Samoyed core lexicon. This research focuses on the shifts which have analogies in Turkic and Tungusic languages, identified with the help of semantic reconstruction. Special maps were created at LingvoDoc linguistic platform in order to demonstrate areas of similar polysemy and semantic shifts, ...

Added: October 19, 2020

Корпусные методы исследования сложных случаев полисемии

Krongauz M., in: Методы когнитивного анализа семантики слова: компьютерно-корпусный подход. Издательский дом ЯСК, 2019. P. 119–140.

This paper analyzes complex cases of polysemy in Russian using corpus methods ...

Added: December 6, 2019

Russian Minority Languages on the Web: Descriptive Statistics

Orekhov B., Krylova I., Popov I. et al., Компьютерная лингвистика и интеллектуальные технологии 2016 No. 15 (22) P. 452–461

An article on the minority languages of Russia on the Internet ...

Added: November 7, 2017

Literature, Language and Computing: Russian Contribution

Springer, 2023.

This book brings together selected revised papers representing a multidisciplinary approach to language and literature. The collection presents studies performed using the methods of computational linguistics in accordance with the traditions of Russian linguistic and literary studies, primarily in line with the Leningrad (Petersburg) philological school. The book comprises the papers allocated into 2 sections ...

Added: September 15, 2023

О способах и средствах выражения страха в русской языковой картине мира

Botchkarev A., Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация 2016 Vol. 14 No. 3 P. 5–14

This article explores the ways of displaying fear in the Russian language image of the world. According to the National Corpus of the Russian language, in its most usual manifestation, fear covers and paralyzes; this distressing emotion is caused by somebody, apprehension to lose something or somebody as well as by exposure to an imminent ...

Added: November 28, 2016