Word Sense Disambiguation for Russian Verbs Using Semantic Vectors and Dictionary Entries
Word sense disambiguation (WSD) methods are useful for many NLP tasks that require semantic interpretation of input. Furthermore, such methods can help estimate word sense frequencies in different corpora, which is important for lexicographic studies and language learning resources. Although previous research on the disambiguation of Russian polysemous verbs established some important results, it focused mostly on reducing ambiguity or determining the most frequent sense rather than on evaluating WSD accuracy. To the best of our knowledge, there is no comprehensively evaluated method that can perform semi-supervised word sense disambiguation for Russian verbs. In this paper we present a WSD method for verbs that reaches an average disambiguation accuracy of 75% using only available linguistic resources: examples and collocations from the Active Dictionary of Russian and large unlabeled corpora. We evaluate the method on contexts sampled from the web-based corpus RuTenTen11 for 10 verbs, with 100 contexts per verb. We compare different variations of the method and analyze its limitations. The method's implementation and the labeled contexts are available online.
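As a rough illustration of how semantic vectors and dictionary examples can be combined for disambiguation, the following minimal sketch averages word vectors over each sense's dictionary examples and picks the sense closest to the context by cosine similarity. It is one plausible reading of the general idea, not the implementation described in the paper. The toy two-dimensional vectors, the English placeholder words, the sense labels, and the function names (`mean_vector`, `cosine`, `disambiguate`) are all assumptions made for the example; a real system would use embeddings trained on a large unlabeled corpus and examples from the dictionary.

```python
import math

# Hypothetical toy vectors; real embeddings would come from a large corpus.
TOY_VECTORS = {
    "race": (1.0, 0.0),
    "track": (0.9, 0.1),
    "marathon": (0.8, 0.2),
    "company": (0.0, 1.0),
    "business": (0.1, 0.9),
    "startup": (0.2, 0.8),
}

# Hypothetical sense inventory: sense label -> words from its dictionary examples.
SENSES = {
    "run/move": ["race", "track"],
    "run/operate": ["company", "business"],
}


def mean_vector(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dim))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def disambiguate(context_words, sense_examples, vectors):
    """Return the sense whose example-based vector is closest to the context."""
    context_vec = mean_vector(context_words, vectors)
    if context_vec is None:
        return None
    best_sense, best_score = None, -2.0  # cosine is never below -1
    for sense_id, example_words in sense_examples.items():
        sense_vec = mean_vector(example_words, vectors)
        if sense_vec is None:
            continue
        score = cosine(context_vec, sense_vec)
        if score > best_score:
            best_sense, best_score = sense_id, score
    return best_sense
```

With the toy data above, `disambiguate(["marathon"], SENSES, TOY_VECTORS)` selects `"run/move"`, and a context with no known words yields `None`, so unanswerable cases stay visible rather than being assigned a default sense.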
The paper presents a comparative analysis of two groups of polysemantic words in Russian and in English, supporting the idea that the structural scheme of a polysemantic word is relative. The specificity of a linguistic unit is always relative and depends on the background against which the comparison is made.
When words have several senses, it is important to describe them properly in a dictionary (a lexicographic task) and to be able to distinguish them in a given context (a computational linguistics task, WSD). Different senses normally have different frequencies in corpora. We introduced several techniques for determining sense frequency based on dictionary entries matched with data from large corpora. Information about word sense frequency is useful not only for explanatory lexicography and WSD; it may also enrich language learning resources. Learners of a foreign language who encounter a word similar to one in their native language are often tempted to assume that the foreign word and its equivalent have the same meaning structure. Sometimes, however, this is not the case, and the most frequent sense of a word in one language may be much less frequent for its cognate. We proposed a method for detecting such cases. Having selected a set of Russian words included in the Active Dictionary of Russian that have more than two dictionary senses and have cognates in English, we estimated the frequencies of the English and Russian senses using SemCor and the Russian National Corpus respectively, matched the senses in each pair of words, and compared their frequencies. In this way we revealed cases in which the most frequent senses and the whole meaning structures differ substantially across the two languages, and studied them in more detail. This technique can be applied not only to cognates but also to pairs of words that dictionaries usually offer as translation equivalents of each other.
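The comparison step can be sketched as follows, assuming that per-sense counts have already been obtained from sense-labeled corpus samples and that senses have been matched by hand. The function names, the sense identifiers, and the idea of reporting the largest frequency gap are illustrative assumptions, not details taken from the paper.

```python
def relative_frequencies(counts):
    """Turn raw per-sense counts into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {sense: 0.0 for sense in counts}
    return {sense: n / total for sense, n in counts.items()}


def most_frequent(counts):
    """Return the sense with the highest count (first one wins on ties)."""
    return max(counts, key=counts.get)


def compare_cognates(ru_counts, en_counts, ru_to_en):
    """Compare sense distributions of a Russian word and its English cognate.

    ru_to_en maps each Russian sense id to its matched English sense id
    (a hypothetical, manually built alignment).
    """
    ru_freq = relative_frequencies(ru_counts)
    en_freq = relative_frequencies(en_counts)
    # Absolute frequency gap for every matched pair of senses.
    gaps = {
        ru: abs(ru_freq.get(ru, 0.0) - en_freq.get(en, 0.0))
        for ru, en in ru_to_en.items()
    }
    ru_top = most_frequent(ru_counts)
    en_top = most_frequent(en_counts)
    return {
        "top_sense_differs": ru_to_en.get(ru_top) != en_top,
        "largest_gap": max(gaps.values()) if gaps else 0.0,
    }
```

A pair whose `top_sense_differs` flag is set, or whose `largest_gap` is high, would be a candidate for the closer manual study described above.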
The collection contains articles by established and early-career scholars on current problems of philology (cognitive linguistics, lexical semantics, semiotics, pragmatics, text linguistics, stylistics; poetics, literary criticism; translation, intercultural communication). The issue also presents research on foreign language teaching methods. The edition is addressed to linguists, translators, teachers, postgraduates, students and a wide readership.
These proceedings include papers on subjects from a wide range of areas, including theoretical linguistics, translation, computational linguistics, natural language processing, and applied linguistics, focusing on a variety of languages, from familiar Indo-European languages to Mandarin Chinese, Wolof, and Dene Sųɬiné. In order to make the papers available to the wider research community, these proceedings are being published electronically and distributed freely at http://www.meaningtext.net
The monograph is a scientific and theoretical publication devoted to the problem of polysemy in a cognitive-semantic context, in particular to complex rather than single metaphors and their cognitive heuristic function. The study further develops and elaborates the theory of image schemas of G. Lakoff and M. Johnson and treats metaphor as a tool for the systematic understanding of deep conceptual spheres that gives structural coherence to human experience. The cognitive approach assumes that the central role in the formation of linguistic meanings belongs to the person as a participant in communication, an observer, and a bearer of knowledge and experience.
The article adopts a cognitive approach and is dedicated to the formation of the substantive core of a polysemantic verb, specifically a verb of relations in modern English. The first part of the article presents views on the semantic nature ("content plan") of the word. In the second part, the authors identify the main cognitive mechanisms underlying the formation of the meanings of the verb compose, as well as the substantive core that unites all the lexical-semantic variants of this verb.
The paper discusses evaluation techniques for semantic role labeling in Russian. It has been shown that the quality of FrameNet-style semantic role labeling largely depends on the number of roles and may decrease if the inventory of roles in the training set differs from that in the output resource. Our study is the first step towards a ‘smart’ evaluation tool that would introduce linguistically relevant criteria into evaluation, place mistakes on a scale from minor to critical, and make evaluation easier when the grid of roles varies.
We run an experiment based on data from the Russian FrameBank, a FrameNet-oriented open-access database that includes a dictionary of Russian lexical constructions and a corpus of tagged examples. The semantic role is one of the parameters that define the predicate-argument patterns in FrameBank. The inventory of roles is modeled hierarchically and forms a graph. We explore the cases in which the role induced by the system and the gold-standard answer do not match. We analyze the statistical criteria of the distribution of roles in the patterns, and the distance between the source and the target in the graph of roles, as a means of assessing goodness of fit.
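The graph-distance idea can be sketched as a shortest-path search between the predicted role and the gold-standard role, so that confusing two closely related roles counts as a smaller error than confusing two distant ones. The role hierarchy below is a small hypothetical example, not the actual FrameBank inventory, and the names `build_graph` and `role_distance` are assumptions made for illustration.

```python
from collections import deque

# Hypothetical parent-child edges in a toy role hierarchy.
HYPOTHETICAL_ROLE_EDGES = [
    ("Participant", "Agentive"),
    ("Participant", "Patientive"),
    ("Agentive", "Agent"),
    ("Agentive", "Causer"),
    ("Patientive", "Patient"),
    ("Patientive", "Theme"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (parent, child) pairs."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    return graph


def role_distance(graph, predicted, gold):
    """Number of edges on the shortest path between two roles.

    Returns None if either role is missing from the graph or the two
    roles are not connected.
    """
    if predicted not in graph or gold not in graph:
        return None
    queue = deque([(predicted, 0)])
    seen = {predicted}
    while queue:  # breadth-first search guarantees the shortest path
        node, dist = queue.popleft()
        if node == gold:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None
```

In this toy hierarchy, predicting `Causer` where the gold role is `Agent` gives a distance of 2 (siblings under the same parent), while predicting `Patient` gives 4, which could be mapped onto a scale from minor to critical mistakes.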
In this paper we consider choice problems under the assumption that the preferences of the decision maker are expressed in the form of a parametric partial weak order, without assuming the existence of any value function. We investigate both the sensitivity (stability) of each non-dominated solution with respect to changes in the parameters of this order, and the sensitivity of the set of non-dominated solutions as a whole to similar changes. We show that this type of sensitivity analysis can be performed by employing techniques of linear programming.