The paper deals with a quantitative aspect of the Bashkir-language segment of the Internet. We analyze the output of a special-purpose crawler that indexed Bashkir sites and collected linguistically valuable data on wordform frequency. These data differ from word frequencies in Bashkir printed texts and in the Russian Internet. Most of the frequent words are marked as official, and obscene words are almost entirely absent. This suggests that the Bashkir Internet is not designed for any kind of communicative practice; rather, its main goal is to signal the existence of the Bashkir language and its presence on the Web. Internet terms such as “site” are rare in the Bashkir web area. Words that are popular in Runet, such as “job” and “mobile phone”, are likewise infrequent in Bashnet.
This paper studies different aspects of a linguo-political conflict over the choice between two Russian toponymic variants – Belorussia and Belarus’ – as well as the adjectives belorusskij (Belorussian) and belarus(s)kij (Belarusian) and the ethnonyms belorus and belarus. The core of the problem is that the Russian language of Russia uses the variant Belorussia, which many Belarusians consider insulting; they prefer the variant Belarus when speaking Russian. In an attempt to understand the structure of this conflict, we analyze how and why the toponym Belarus appeared and spread through the newspapers of the 1990s, study the data from two online polls and the distribution of some words derived from the two toponymic variants, and finally discuss the scenarios of conflict communication in discussions on various social media. One of the polls shows the social distribution of the two toponymic variants, and the other examines the attitude of Belarusians towards the toponym Belorussia and its derivatives. We show that each side of the conflict has its own limited set of ideas that reappear in conflict communication in comments under different articles on the Internet.
The paper proposes new approaches to the problem of Russian dative subjects in predicative and adjectival constructions. The core idea of the research is to study the distribution of dative subject constructions with predicative and adjectival forms that can potentially be used in such constructions. The methodological novelty of the approach lies in the following aspects. First, the object of the research is the choice between expressing the dative subject explicitly and omitting it. While predicates are usually classified on the basis of whether they can in principle be used with a dative subject, I study the trends in the explicit use of datives (or prepositional beneficiary arguments) among the “dative subject predicates”, and show that the actual frequency of dative subjects can differ greatly from one predicate to another. Second, I consider separately the different morphological forms of the same dative subject lexeme (i.e. full and short adjectives, comparative adjectives and predicatives) and show that they, too, may follow different strategies with explicit dative subjects. Finally, I compare data from the 18th and the 21st centuries and use hierarchical clustering to reveal some diachronic trends in the use of dative subjects. The research is based on a quantitative study of examples from the Russian National Corpus.
The paper considers the less known aspects in the functioning of Russian lexical “xeno” markers, in particular, of the particle jakoby ‘allegedly, ostensibly’. Traditionally described as expressing the falsity of a proposition contained in somebody’s utterance, in conjunction with a negative assessment of the utterer as aware of its falsity, jakoby displays very different usages in the language of contemporary mass media. Namely, it is frequently used as a mere marker of evidentiality, without an obligatory assessment of the proposition as false or of its source as untruthful. In fact, it can even be used to refer to statements that are treated as true within the very same text, only to indicate that the source of this information is not the writer herself but somebody else (e.g., a different news agency), in what might be termed as “safety” strategy. Besides, jakoby in its mass media usages demonstrates unusual syntactic behaviors, namely shifts in scope, where it is placed before the speech verb rather than before the challenged proposition: jakoby utverzhdat’, chto P ‘jakoby claim that P’ instead of utverzhdat’, chto jakoby P ‘claim that jakoby P’. However, the study of the Russian-English parallel corpus reveals that these usages are not as unusual as they may appear. In Russian translations of English texts jakoby sometimes functions as a translation of the English supposedly, allegedly, ostensibly or other (e.g., verbal) markers of uncertainty, but more frequently occurs with no apparent stimulus in the source, merely to mark indirect quotation. It appears therefore that there is a certain need in the Russian language for a neutral evidentiality marker. 
This need is occasionally filled by jakoby, which in this case displays a tendency towards grammaticalization: it expresses that the source of information is other than the speaker herself (but contains no other semantic components), and takes syntactic scope over the speech verb instead of the proposition it challenges.
Analyzing several Russian nouns denoting everyday objects, we explain why a word sense frequency dictionary is necessary. We propose techniques for calculating approximate sense frequencies, based on the analysis of native speaker surveys and the annotation of the most frequent collocations in a large text corpus (we used the RuTenTen11 corpus integrated into the Sketch Engine system). A word sense frequency dictionary could be used in a variety of NLP tasks, in particular for probabilistic word sense disambiguation when no context is available, in creating second language learning resources, and in academic lexicography. In addition, studies of the sense sets of polysemous words and their comparative frequencies are important for linguistic theory, because they shed light on the evolution of the lexical system.
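As a rough illustration of the collocation-based estimate, the share of each sense can be approximated by summing the corpus frequencies of the collocations annotated with that sense. The sketch below is only an assumption about how such a calculation might look: the abstract does not specify the formula, and the function name `sense_shares`, the triple layout, and the sample values are all hypothetical.

```python
from collections import defaultdict

def sense_shares(annotated_collocations):
    """Estimate relative sense frequencies from annotated collocations.

    annotated_collocations -- iterable of (collocation, sense, corpus_frequency)
    triples; this layout is a hypothetical simplification.
    """
    totals = defaultdict(int)
    for _collocation, sense, freq in annotated_collocations:
        totals[sense] += freq
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Normalize so that the shares of all senses sum to 1.
    return {sense: count / grand_total for sense, count in totals.items()}

# Placeholder data for illustration only (not taken from the paper).
sample = [
    ("collocation_a", "sense_1", 60),
    ("collocation_b", "sense_2", 30),
    ("collocation_c", "sense_1", 10),
]
shares = sense_shares(sample)
print(shares)
```

Survey-based estimates, which the abstract also mentions, are not modeled here; they could be combined with these shares in any number of ways.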
In this paper, we present a study of neologisms and loan words frequently occurring in Facebook user posts. We have collected a dataset of over 573 million posts written during 2006–2013 by Russian-speaking Facebook users. From these, we have built a vocabulary of the most frequent lemmatized words missing from the Opencorpora dictionary (http://opencorpora.org/dict.php), the assumption being that many such words have entered common use only recently. This assumption is certainly not true for all the words extracted in this way; for that reason, we manually filtered the automatically obtained list in order to exclude non-Russian or incorrectly lemmatized words, as well as words recorded by other dictionaries or those occurring in pre-2000 texts from the Russian National Corpus (http://www.ruscorpora.ru). The result is a list of 168 words that can potentially be considered neologisms.
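The automatic part of the extraction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `candidate_neologisms`, the `min_count` threshold, the toy lemma list, and the in-memory `known_lemmas` set standing in for the dictionary are all hypothetical, and the manual filtering stage is not modeled.

```python
from collections import Counter

def candidate_neologisms(lemmas, known_lemmas, top_n=10, min_count=2):
    """Return the most frequent lemmas absent from a reference dictionary.

    lemmas       -- iterable of already-lemmatized tokens (hypothetical input)
    known_lemmas -- set standing in for the reference dictionary
    """
    counts = Counter(lemma.lower() for lemma in lemmas)
    unknown = [
        (lemma, n) for lemma, n in counts.items()
        if lemma not in known_lemmas and n >= min_count
    ]
    # Sort by descending frequency, then alphabetically for stable output.
    unknown.sort(key=lambda item: (-item[1], item[0]))
    return unknown[:top_n]

# Toy data for illustration only.
posts = ["лайк", "друг", "лайк", "репост", "друг", "лайк", "репост", "селфи"]
dictionary = {"друг"}
result = candidate_neologisms(posts, dictionary)
print(result)
```

The output of such a step is only a candidate list; as the abstract notes, it still has to be checked by hand against other dictionaries and earlier corpus texts.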
We present an attempt at an etymological classification of these neologisms (unsurprisingly, most of them have recently been borrowed from English, but there are also quite a few new words composed of previously borrowed stems) and identify various derivational patterns. We also classify words into several large thematic areas, “internet”, “marketing”, and “multimedia” being among those with the largest number of words.
We consider our results preliminary, but believe that, together with the word base collected in the process, they can serve as a starting point in further studies of neologisms and lexical processes that lead to their acceptance into the mainstream language.
The paper discusses different modes of evaluation in Russian. Evaluation is considered a speech act based on a cognitive procedure of the following form: (i) evaluating an object X as possessing a feature q consists of comparing a parameter Q with X and selecting q as a function of Q with argument X; (ii) the feature q presupposes recommendations for decision making in connection with the object X. The cognitive procedure of describing an object X as possessing a feature q, by contrast, does not presuppose any recommendation for decision making. In some discursive modes the semantics of evaluation loses its force of influence, or at least that force is weakened. A discursive mode is defined as a sphere of functioning of speech forms in discourse in which their meaning regularly changes. Different discourses allow different kinds of discursive modes. The paper discusses the following discursive modes, which modify the force of evaluation: irony, language game, common nomination, and indefinite reference.
The article suggests a way of modelling the linear position of appellatives in Russian. The term «appellatives» covers units with similar functions and syntactic properties, namely truncated vocative forms and discourse markers of the type «slushaj» (lit. ‘listen-Imp.2P’). The model assumes a distinction between accented and non-accented uses in three positions (initial, middle, final) and takes into account pauses and some peculiarities of dialogue organization. The position types are the following: (1) «initial isolated», (2) «middle accented», (3) «final accented», (4) «initial cooperated», (5) «middle non-accented» and (6) «final non-accented». A sample from a speech corpus is analyzed. It turns out that vocatives can occur in all the listed position types, whereas the discourse marker «slushaj» shows no reliable uses in positions of types (2) and (3). For sequences like slushaj+Voc, nu+Voc, nu+listen etc., the notion of an «appellative complex» is introduced. The composition of appellative complexes in the sample data is considered.
The paper considers the senses of the Russian adjective poslednij ‘last’. Its polysemy is analyzed as deriving from a certain core semantic structure that is common to all its meanings. The core structure has two semantic valencies – of a sequence and of a sequence element. Modifications of the core structure, including additional valencies (point of reference and landmark) account for its polysemy, as well as for diversity of its collocational and syntactic properties. The paper also demonstrates the role of pragmatics and lexicalization of grammatical and syntactic forms in disambiguating different meanings of poslednij, against the backdrop of its English correlates.
The paper considers the Russian words spravedlivost’ (justice) and nespravedlivost’ (injustice) and their corresponding concepts. Formally, the two words are antonyms, since morphologically they differ only in the morpheme ne- (“no”). Their meanings, however, differ in a more complicated way. The word spravedlivost’ has an abstract meaning: it denotes a value category. The extension of nespravedlivost’ is different: the word denotes a wide range of situations in which features of justice as a value concept are violated. For this reason spravedlivost’ is de facto a singulare tantum: it has no plural. The word nespravedlivost’, by contrast, is used in Russian speech in both the singular and the plural. The semantic differences between the two words become apparent in the metaphorical models that speakers use to interpret justice and injustice in Russian public discourse, which is modeled here by a text corpus of print media.