Frequency dictionary of inflectional paradigms: core Russian vocabulary
A new kind of frequency dictionary is a valuable reference for researchers and learners of Russian. It shows the grammatical profiles of nouns, adjectives and verbs, namely, the distribution of grammatical forms in the inflectional paradigm. The dictionary is based on data from the Russian National Corpus (RNC) and covers a core vocabulary (the 5000 most frequently used lexemes). Russian is a morphologically rich language: its noun paradigms contain two dozen case and number forms, and its verb paradigms include up to 160 grammatical forms. The dictionary departs from traditional frequency lexicography in several ways: 1) word forms are arranged in paradigms, and their frequencies can be compared and ranked; 2) the dictionary focuses on the grammatical profiles of individual lexemes rather than on the overall distribution of grammatical features (e.g. the fact that Future forms are used less frequently than Past forms); 3) grammatical profiles of lexical units can be compared against the mean scores of their lexico-semantic class; 4) in each part of speech or semantic class, lexemes with certain biases in their grammatical profile can be easily detected (e.g. verbs used mostly in the Imperative or in the Past neuter, or nouns often used in the plural); 5) the distribution of homonymous word forms and grammatical variants can be traced over time and within certain genres and registers. The dictionary will be a source for research in the fields of Russian grammar, paradigm structure, form acquisition, grammatical semantics, and the variation of grammatical forms. The main challenge for this initiative is the intra-paradigm and inter-paradigm homonymy of word forms in corpus data. Manual disambiguation is accurate but covers only ca. 5 million words in the RNC, so the data may be sparse and possibly unreliable. Automatic disambiguation yields slightly worse results; however, a larger corpus provides more reliable data for rare word forms.
A user can switch between a ‘basic’ version, which is based on a smaller collection of manually disambiguated texts, and an ‘expanded’ version, which is based on the main corpus, the newspaper corpus, the corpus of poetry and the spoken corpus (320 million words in total). The article addresses some general issues such as establishing a common basis of comparison, the level of granularity of the grammatical profile, and units of measurement. We suggest certain solutions related to the selection of data, corpus data processing and maintaining the online version of the frequency dictionary.
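The notion of a grammatical profile described above can be illustrated with a minimal Python sketch. It is not the dictionary's actual implementation: the function names, the form labels and the counts are all hypothetical, and the class mean is assumed here to be a simple average of the individual lexeme profiles, which the abstract does not specify.

```python
def grammatical_profile(form_counts):
    """Turn raw counts of grammatical forms into relative frequencies."""
    total = sum(form_counts.values())
    if total == 0:
        return {}
    return {form: count / total for form, count in form_counts.items()}


def class_mean_profile(profiles):
    """Average several lexeme profiles form by form.

    Assumption: a form missing from a lexeme's profile counts as 0.
    """
    if not profiles:
        return {}
    forms = {form for profile in profiles for form in profile}
    return {
        form: sum(p.get(form, 0.0) for p in profiles) / len(profiles)
        for form in forms
    }


def bias(profile, mean_profile):
    """Difference between a lexeme's share of each form and the class mean."""
    forms = set(profile) | set(mean_profile)
    return {f: profile.get(f, 0.0) - mean_profile.get(f, 0.0) for f in forms}


# Hypothetical counts, invented for illustration only (not RNC data).
counts = {
    "lexeme_a": {"nom.sg": 60, "gen.sg": 30, "nom.pl": 10},
    "lexeme_b": {"nom.sg": 20, "gen.sg": 20, "nom.pl": 60},
}

profiles = {lexeme: grammatical_profile(c) for lexeme, c in counts.items()}
mean = class_mean_profile(list(profiles.values()))

# Rank the forms of one lexeme by their share in its paradigm.
ranked = sorted(profiles["lexeme_a"].items(), key=lambda kv: kv[1], reverse=True)

# Positive values mark forms that a lexeme uses more than its class on average.
bias_b = bias(profiles["lexeme_b"], mean)
```

In this toy data, `bias_b["nom.pl"]` is positive, which is the kind of signal one might use to flag a noun that is used unusually often in the plural. With sparse counts such differences would be unstable, which is the trade-off between the manually and automatically disambiguated data noted above.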
In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Bekasovo, May 29 - June 2, 2013). In 2 vols. Vol. 1: Main conference program. Issue 12 (19). Moscow: RGGU, 2013. P. 478-489.
A new electronic frequency dictionary shows the distribution of grammatical forms in the inflectional paradigm of Russian nouns, adjectives and verbs, i.e. the grammatical profile of individual lexemes and lexical groups. While the frequency hierarchy of grammatical categories (e.g. the frequency of part of speech classes or the average ratio of Nominative to Instrumental case ...
Added: May 13, 2013
Slovĕne 2018 No. 1 P. 424-436
The paper is an overview of the Repository of Variationist Research (https://vastry.ru/), an online storage and interactive plotting tool for quantitative sociolinguistic data. The paper describes a number of sociolinguistic experiments from which the data come and outlines the Repository and the toolkit it provides to its users. ...
Added: August 18, 2018
A Data Analysis Tool for the Corpus of Russian Poetry / NRU HSE. Series WP BRP "Linguistics". 2018. No. 77.
A data analysis tool for the Corpus of Russian Poetry (a part of the Russian National Corpus) is designed for quantitative research in various areas of versology and linguistic aspects of poetic texts. The core part, a statistical database of the corpus, includes annotation at the level of texts, verses, words as well as patterns ...
Added: December 13, 2018
Moscow: Языки славянской культуры, 2016
Corpus linguistics can be broadly defined in terms of two partially overlapping research dimensions. On the one hand, corpus linguistics is knowledge of how to compile and annotate linguistic corpora. On the other hand, corpus linguistics is a family of qualitative and quantitative methods of language study based on corpus data. The book presents ...
Added: March 26, 2015
On the sense of duty as a language-specific concept of the Russian language (in the focus of the Russian National Corpus)
Вестник Санкт-Петербургского университета. Язык и литература 2019 Vol. 16 No. 1 P. 20-32
The article explores the sense of duty as a language-specific concept in Russian linguistic consciousness. In this regard, the Russian National Corpus is the more appropriate source, because the conceptual configuration of the analyzed concept is not present in “finished” form in any single utterance but may be reconstructed only from the totality of all possible ...
Added: April 2, 2019
Adjectival means of expressing precedence in subject, object and predicative groups in French and Russian (a comparative analysis)
Альманах современной науки и образования 2010 No. 11(42) Part 1 P. 161-167
Time is a universal interdisciplinary category studied by various sciences. Linguistic time is diverse in its manifestations and is conveyed not only by the tense forms of the verb but also by non-verb means. Every part of speech has lexico-semantic groups with temporal meaning. In studying non-verb means of expressing temporal relations, it is useful to distinguish three planes of analysis: the plane of content, the plane of expression and the plane of functioning. In ...
Added: November 7, 2012
St. Petersburg: Издательство Нестор-История, 2018
The volume is the third issue of a corpus-based grammar of Russian. The volume deals with the issues of parts of speech and, more generally, with formal classes of the lexicon. It comprises descriptive papers on individual parts of speech and minor word classes. ...
Added: November 4, 2018
Сибирский филологический журнал 2019 No. 2 P. 189-215
The work deals with the strategies for predicate agreement with quantified noun groups headed by nouns. In Russian, as in other Slavic languages, predicate agreement with quantified noun phrases allows singular or plural forms of the predicate. As for sentences with the quantifier nouns r’ad, polovina, chast’, mnozestvo, three agreement strategies are possible: predicate agrees with ...
Added: September 8, 2019
Zeitschrift für Slavische Philologie 2016 Vol. 72 No. 2 P. 255-269
The law “On the state language of the Russian Federation,” adopted in 2005, became a reflection of debates about the Russian language at the beginning of the twenty-first century and caused frustration in both conservative and liberal segments of society. On multiple occasions, attempts were made to amend the law or enact additional laws aimed ...
Added: October 22, 2017
Acta Linguistica Petropolitana. Труды института лингвистических исследований 2016 Vol. XII No. 1 P. 336-363
The paper presents a corpus-driven study of the Russian PP-based degree modifier do uzhasa (lit. ‘to horror’), suggesting a two-stage grammaticalization path. The first stage (presumably, XVIII–XIX c.) involves subjectification, while during the second stage, subjective readings give rise to intensifier readings through conceptual metonymy. Both stages see a host class expansion. This process is ...
Added: November 27, 2017
Вестник Томского государственного университета 2015
The paper deals with the morphosyntactic and stylistic properties of the Russian verb SUNUT’ and argues for their semantic motivation. SUNUT’ is usually considered one of the “putting verbs” (denoting change of location), but it has some peculiarities in its syntax, derivational patterns, semantics and stylistics. Unlike other verbs of this taxonomic class, SUNUT’ profiles ...
Added: June 2, 2015
Changes of consonants in place and manner of articulation at word junctures in some biphonemic clusters
Известия Юго-Западного государственного университета. Серия: Лингвистика и педагогика 2015 No. 2 P. 78-88
The paper contains the results of a phonetic experiment concerning changes in place and manner of articulation in four biphonemic consonant clusters in Modern Standard Russian. The article reviews the rules for assimilation and coarticulation of consonants and compares the application of these rules in internal and external sandhi positions. ...
Added: October 15, 2016
Acta Linguistica Petropolitana. Труды института лингвистических исследований 2015 Vol. 11 No. 1 P. 565-584
Requests and commands expressed by the imperative verb forms appear during the earliest period of language acquisition. For some verbs the imperative may be the first form to be acquired and appears to be the initial step towards the acquisition of the full paradigm. In this article the typical imperative contexts of the child-adult communication ...
Added: October 29, 2016
Вестник Тверского государственного университета. Серия: Филология 2012 Vol. 10 No. 2 P. 21-28
«Bankruptcy» Concept Within the Legal Linguistics Coordinates: Russian–English–French Approximations. The article addresses the notion of bankruptcy as perceived by speakers of contemporary Russian, English and French, both lawyers and participants in professional communication from other trades. The semantic structure of the term is identified based on its lexicographic and regulatory definitions. ...
Added: October 4, 2012
International Journal of Bilingualism 2021 Vol. 25 No. 1 P. 338-358
Aims and objectives: In Dagestan, Russian is the language of education, urban way of life, and upward social mobility, and the means of communication between speakers of different languages. This is a result of a quick and drastic change. At the end of the 19th century, Russian was spoken by less than 1% of the population. ...
Added: October 14, 2020
Granada: Jizo Ediciones, 2011
The conference proceedings contain the full texts of papers on the topics "Russian-Spanish comparative studies", "The image of Russia and Spain in literature, history and cultural studies", and "The Russian and Spanish languages in the theory and practice of translation". ...
Added: February 26, 2013
Типология морфосинтаксических параметров 2018 Vol. 1 No. 1 P. 136-150
The paper aims to further advance the line of research where all uses of the Russian subordinator poka are given a unified semantics, and all constructions involving it are assumed to be fully compositional. I focus on several further issues, viz. whether it is simultaneity or precedence that lies at the core of the meaning ...
Added: February 19, 2019
Linguistic Typology 2017 Vol. 21 No. 3 P. 387-456
The category of person has both inflectional and lexical aspects, and the distinction provides a finely graduated grammatical trait, relatively stable in both families and areas, and revealing for both typology and linguistic geography. Inflectional behavior includes reference to speech-act roles, indexation of arguments, discreteness from other categories such as number or gender, assignment and/or placement in syntax, arrangement in ...
Added: November 14, 2017
St. Petersburg: Златоуст, 2016
This book is a collection of papers written by Russian and foreign linguists to highlight the different aspects of bilingualism. Much attention is paid to early simultaneous and successive bilingualism in children; however, adults speaking several languages in natural settings as well as in the classroom are also considered. Some chapters concentrate on language attrition, an ...
Added: October 2, 2016
On the deep semantics and functional-referential features of adjectival markers of precedence in French and Russian
Альманах современной науки и образования 2010 No. 12 P. 226-231
Added: November 23, 2012
On pseudo-identification in Russian (based on designations of a person in Russian literary texts)
Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация 2020 Vol. 18 No. 2 P. 5-12
There is no way to identify an animate object other than to describe its specific characteristics which necessarily look like deviations from the normal “average” pattern, named here paragon, in which the Axiological Standard of a human group is fixed. Of particular heuristic interest is, in this regard, the logical pattern, often used in Russian for describing such ...
Added: July 28, 2020
Филологические науки. Вопросы теории и практики 2015 Vol. 1 No. 7 P. 56-63
The article presents the results of the experimental study of the phonetic realization of combinations of two homorganic plosive sounds at the junction of phonetic words in the modern Russian language. The conducted experiment shows that in the considered positions as a result of the co-articulation rules a single plosive is formed, having subject to ...
Added: October 14, 2016
Moscow: Флинта, 2021
The collective monograph "The Russian Language in Internet Communication: Linguo-Cognitive and Pragmatic Aspects" analyzes the current state of the Russian language on the Internet and offers a linguocultural interpretation of active processes in internet communication through the prism of linguo-cognitive and linguo-pragmatic approaches ...
Added: April 13, 2021
The mechanism of semantic calquing and its role in filling defective number paradigms of abstract nouns in modern Russian
Вестник Санкт-Петербургского университета. Серия 9. Филология. Востоковедение. Журналистика 2015 No. 2 P. 96-104
The paper analyses the criteria for determining semantic calques in modern Russian based on typical examples. The analysis shows that a semantic calque can be clearly attested only when there is a pre-established translation correspondence between the words of the source language and the target language. Special attention is paid to the examples of semantic ...
Added: October 4, 2015