Of all publications in the section: 33
Article
Сомин А. А., Полий А. А. Computational Linguistics and Intellectual Technologies. 2016. No. 15 (22). P. 645-659.

This paper studies different aspects of a linguo-political conflict over the choice between two Russian toponymic variants, Belorussia and Belarus’, as well as between the adjectives belorusskij (Belorussian) and belarus(s)kij (Belarusian) and the ethnonyms belorus and belarus. The core of the problem is that the Russian spoken in Russia uses the variant Belorussia, which many Belarusians consider insulting; they prefer the variant Belarus when speaking Russian. To understand the structure of this conflict, we analyze how and why the toponym Belarus appeared and spread through the newspapers of the 1990s, study data from two online polls and the distribution of some words derived from the two toponymic variants, and finally discuss the scenarios of conflict communication in discussions on various social media. One of the polls shows the social distribution of the two toponymic variants, and the other examines the attitude of Belarusians towards the toponym Belorussia and its derivatives. We show that each side of the conflict has its own limited set of ideas that reappear in conflict communication in comments under different articles on the Internet.

Added: Oct 20, 2016
Article
Иомдин Б. Л., Иомдин Л. Л. Computational Linguistics and Intellectual Technologies. 2020. No. 19. P. 400-415.

The paper discusses valency frames of a number of Russian verbal predicates whose semantics includes a speech act and, at a certain step of semantic decomposition, negation, such as vozražat’ ‘object, retort’, vozmuščat’sja ‘resent, be indignant’ or izvinjat’sja ‘apologize’. It is hypothesized that the frames of such predicates include a pair of propositional valencies distinctly opposed to each other: (1) the valency of stimulus, which expresses the state of events, and (2) the valency of response, which introduces a speech act performed by the subject as a reaction to this state of events and offering an explanation. For example, in the sentence Ivan izvinilsja, čto ne prišel na moj den’ roždenija ‘Ivan apologized that he did not come to my birthday party’ the clause starting with čto ‘that’ represents the state of events, whilst in the sentence Ivan izvinilsja, čto ploxo sebja čuvstvoval ‘Ivan apologized that he was not feeling well’ the čto-clause introduces Ivan’s response to the stimulus (e.g. not coming to the birthday party). It is shown that these valencies cannot be adequately described with a single semantic role of content. The authors also give a generalization of this phenomenon, comparing it to other instances of valency pairs, and suggest the existence of predicates having two valency centers.

Added: Sep 12, 2020
Article
Апресян В. Ю. Computational Linguistics and Intellectual Technologies. 2016. P. 16-27.
Added: Mar 7, 2017
Article
Бонч-Осмоловская А. А. Computational Linguistics and Intellectual Technologies. 2015. Vol. 1. No. 14 (21). P. 80-95.

The paper proposes new approaches to the problem of Russian dative subjects in predicative and adjective constructions. The core idea of the research is to study the distribution of dative subject constructions with predicative and adjective forms that can potentially be used in such constructions. The methodological novelty of the approach lies in the following aspects. First of all, the object of the research is the choice between expressing and omitting the dative subject in the construction. While predicates are usually classified on the basis of whether they can in principle be used with a dative subject, I study the trends for explicit use of dative (or prepositional beneficiary) arguments among the “dative subject predicates”, and show that the actual frequency of dative subjects can differ greatly from predicate to predicate. Secondly, I consider separately the different morphological forms of the same dative subject lexeme (i.e. adjectives in full and short forms, comparative adjectives and predicatives) and show that they may also reveal different strategies with explicit dative subjects. Finally, I compare data from the 18th and the 21st centuries and use hierarchical clustering to reveal some diachronic trends in the use of dative subjects. The research is based on a quantitative study of examples from the Russian National Corpus.

Added: Apr 15, 2015
Article
Апресян В. Ю., Шмелев А. Д. Computational Linguistics and Intellectual Technologies. 2017. Vol. 2. P. 17-29.

The paper considers the lesser-known aspects of the functioning of Russian lexical “xeno” markers, in particular, of the particle jakoby ‘allegedly, ostensibly’. Traditionally described as expressing the falsity of a proposition contained in somebody’s utterance, in conjunction with a negative assessment of the utterer as aware of its falsity, jakoby displays very different usages in the language of contemporary mass media. Namely, it is frequently used as a mere marker of evidentiality, without an obligatory assessment of the proposition as false or of its source as untruthful. In fact, it can even be used to refer to statements that are treated as true within the very same text, only to indicate that the source of this information is not the writer herself but somebody else (e.g., a different news agency), in what might be termed a “safety” strategy. Besides, jakoby in its mass media usages demonstrates unusual syntactic behavior, namely shifts in scope, where it is placed before the speech verb rather than before the challenged proposition: jakoby utverzhdat’, chto P ‘jakoby claim that P’ instead of utverzhdat’, chto jakoby P ‘claim that jakoby P’. However, the study of the Russian-English parallel corpus reveals that these usages are not as unusual as they may appear. In Russian translations of English texts jakoby sometimes functions as a translation of the English supposedly, allegedly, ostensibly or other (e.g., verbal) markers of uncertainty, but more frequently occurs with no apparent stimulus in the source, merely to mark indirect quotation. It appears therefore that there is a certain need in the Russian language for a neutral evidentiality marker. This need is occasionally filled by jakoby, which in this case displays a tendency towards grammaticalization: it expresses that the source of information is other than the speaker herself (but contains no other semantic components), and takes syntactic scope over the speech verb instead of the proposition it challenges.

Added: Aug 17, 2017
Article
Иомдин Б. Л., Лопухина А. А., Носырев Г. В. Computational Linguistics and Intellectual Technologies. 2014. No. 13. P. 204-229.

Analyzing several Russian nouns denoting everyday life objects, we explain why a word sense frequency dictionary is necessary. Techniques for calculating approximate frequencies are proposed, based on the analysis of native speaker surveys and the annotation of the most frequent collocations in a large text corpus (we used the RuTenTen11 corpus integrated into the Sketch Engine system). A word sense frequency dictionary could be used in a variety of NLP tasks, in particular for probabilistic word sense disambiguation without available context, in creating second language learning resources, and in academic lexicography. Besides, studies of the sense sets of polysemous words and their comparative frequencies are important for linguistic theory, because they shed light on the evolution of the lexical system.

Added: Oct 11, 2016
Article
Объедков С. А., Панченко А. И., Муравьев Н. А. Computational Linguistics and Intellectual Technologies. 2014. P. 440-454.

In this paper, we present a study of neologisms and loan words frequently occurring in Facebook user posts. We have collected a dataset of over 573 million posts written during 2006–2013 by Russian-speaking Facebook users. From these, we have built a vocabulary of the most frequent lemmatized words missing from the Opencorpora dictionary (http://opencorpora.org/dict.php), the assumption being that many such words have entered common use only recently. This assumption is certainly not true for all the words extracted in this way; for that reason, we manually filtered the automatically obtained list in order to exclude non-Russian or incorrectly lemmatized words, as well as words recorded by other dictionaries or those occurring in pre-2000 texts from the Russian National Corpus (http://www.ruscorpora.ru). The result is a list of 168 words that can potentially be considered neologisms. We present an attempt at an etymological classification of these neologisms (unsurprisingly, most of them have recently been borrowed from English, but there are also quite a few new words composed of previously borrowed stems) and identify various derivational patterns. We also classify words into several large thematic areas, “internet”, “marketing”, and “multimedia” being among those with the largest number of words. We consider our results preliminary, but believe that, together with the word base collected in the process, they can serve as a starting point for further studies of neologisms and the lexical processes that lead to their acceptance into the mainstream language.

Added: Oct 19, 2017
Article
Муравьев Н. А., Панченко А. И., Объедков С. А. Computational Linguistics and Intellectual Technologies. 2014. No. 13. P. 440-454.

In this paper, we present a study of neologisms and loan words frequently occurring in Facebook user posts. We have collected a dataset of over 573 million posts written during 2006–2013 by Russian-speaking Facebook users. From these, we have built a vocabulary of the most frequent lemmatized words missing from the Opencorpora dictionary (http://opencorpora.org/dict.php), the assumption being that many such words have entered common use only recently. This assumption is certainly not true for all the words extracted in this way; for that reason, we manually filtered the automatically obtained list in order to exclude non-Russian or incorrectly lemmatized words, as well as words recorded by other dictionaries or those occurring in pre-2000 texts from the Russian National Corpus (http://www.ruscorpora.ru). The result is a list of 168 words that can potentially be considered neologisms.

We present an attempt at an etymological classification of these neologisms (unsurprisingly, most of them have recently been borrowed from English, but there are also quite a few new words composed of previously borrowed stems) and identify various derivational patterns. We also classify words into several large thematic areas, “internet”, “marketing”, and “multimedia” being among those with the largest number of words.

We consider our results preliminary, but believe that, together with the word base collected in the process, they can serve as a starting point in further studies of neologisms and lexical processes that lead to their acceptance into the mainstream language.

Added: Mar 11, 2015
Article
Баранов А. Н. Computational Linguistics and Intellectual Technologies. 2016. No. 15. P. 72-83.

The paper discusses different modes of evaluation in Russian. Evaluation is considered as a speech act based on a cognitive procedure which has the following form: (i) evaluation of an object X as possessing a feature q consists of comparing parameter Q with X and picking out q as a function of Q with the argument X; (ii) the feature q presupposes recommendations for decision making in connection with the object X. The cognitive procedure of describing an object X as possessing a feature q does not presuppose any recommendation for decision making. In some discursive modes the semantics of evaluation loses its force or at least gets weaker. A discursive mode is defined as a sphere of functioning of speech forms in discourse in which their meaning regularly changes. Different discourses allow different kinds of discursive modes. The paper discusses the following discursive modes, which modify the force of evaluation: irony, language game, common nomination, indefinite reference.

Added: Oct 20, 2016
Article
Летучий А. Б. Computational Linguistics and Intellectual Technologies. 2013. No. 12 (19). P. 420-434.
Added: Jun 13, 2013
Article
Апресян В. Ю., Шмелев А. Д. Computational Linguistics and Intellectual Technologies. 2016. P. 28-39.

The paper considers the senses of the Russian adjective poslednij ‘last’. Its polysemy is analyzed as deriving from a certain core semantic structure that is common to all its meanings. The core structure has two semantic valencies – of a sequence and of a sequence element. Modifications of the core structure, including additional valencies (point of reference and landmark) account for its polysemy, as well as for diversity of its collocational and syntactic properties. The paper also demonstrates the role of pragmatics and lexicalization of grammatical and syntactic forms in disambiguating different meanings of poslednij, against the backdrop of its English correlates.

Added: Mar 7, 2017
Article
Баранов А. Н. Computational Linguistics and Intellectual Technologies. 2015. Vol. 1. No. 14. P. 19-29.

The paper considers the Russian words spravedlivost’ (justice) and nespravedlivost’ (injustice) and their corresponding concepts. It is shown that formally the words spravedlivost’ and nespravedlivost’ are antonyms, because morphologically they differ only in the morpheme ne- (“no”). But their meanings differ in a more complicated way. The word spravedlivost’ has an abstract meaning: it denotes a value category. The extension of the word nespravedlivost’ is different: it is used to denote a wide range of situations in which features of justice as a value concept are violated. For this reason the word spravedlivost’ is de facto a singulare tantum: it has no plural. The word nespravedlivost’ (injustice), by contrast, occurs in Russian speech in both singular and plural forms. The semantic differences between the two words become apparent in the metaphorical models that speakers use to interpret justice and injustice in Russian public discourse, which is modeled here by a text corpus of print media.

Added: Oct 21, 2016