The present Active Dictionary of the Russian Language is an innovative product, the first dictionary of its type in Russian lexicography. It draws on the latest theoretical achievements in the following areas: a) theoretical linguistics (the principle of the lexicon as a system, the principle of integrated linguistic description); b) semantics (fundamental classification of predicates, semantic metalanguage, the theory of definitions, regular polysemy, rules for the interaction of meanings in text); c) syntax (syntactic government as a reflection of the semantic valency structure of predicates, and non-valency syntactic properties); d) the theory of collocations (the apparatus of lexical functions and the typology of bound collocations); e) the notion of lexicalized prosody (in particular, phrasal stress). All this information is adapted so as to be easily comprehensible to an average Russian speaker with no specialized linguistic knowledge beyond what is provided by the standard school course in the Russian language. As a product that combines a well-motivated theoretical foundation in lexicography with practical application, the present dictionary has no comparable analogues. The first issue includes letters A to G.
The book describes the Dargwa variety spoken in the village of Tanti (Central Daghestan) and consists of a grammatical sketch and a few chapters devoted to specific aspects of grammar. The variety discussed in the book shows complex systems of nominal and verbal inflection as well as a number of other non-trivial features. The book includes a detailed discussion of the structure of the nominal phrase, clause structure, agreement and various other phenomena, as well as a small corpus of glossed texts and lexical information.
The book discusses the main notions of the descriptor theory of metaphor. Each instance of metaphor usage is described as a tuple of significative and denotative descriptors. Quantitative analysis of descriptors of both kinds makes it possible to create models of the metaphorical structures of different discourses and of their restricted fragments. The book presents the conceptual apparatus of descriptor theory together with case studies that apply this apparatus to concrete language areas, namely political discourse and restricted discourses (the language of the Gospels and the novels of A. Platonov).
Corpus linguistics can be broadly defined in terms of two partially overlapping research dimensions. On the one hand, corpus linguistics is knowledge of how to compile and annotate linguistic corpora. On the other hand, corpus linguistics is a family of qualitative and quantitative methods of language study based on corpus data. The book presents the first steps taken by Russian corpus linguistics toward the development of language corpora and corpus-based resources as well as their use in grammatical and lexical analysis.
The first part of the book focuses on the annotation of Russian texts at several levels: lemmas, part of speech and inflectional forms, word formation, lexical-semantic classes, syntactic dependencies, semantic roles, frames, and lexical constructions. We discuss various theoretical principles and practical considerations motivating the corpus markup design, provide details on the creation of lexical resources (electronic dictionaries and databases) and text processing software, and consider complicated cases that present challenges for the annotation of corpora both manually and automatically. In most cases we describe the annotation of the Russian National Corpus (RNC, ruscorpora.ru) and its affiliate project FrameBank (framebank.ru).
Frequency data depend not only on the representativeness and balance of texts in a corpus, but also on the rules and tools used for annotation. The book addresses the development of evaluation standards for Russian NLP resources, namely, morphological taggers and dependency parsers. In addition, the book presents several experiments on automatic annotation and disambiguation: lemmatization of word forms not in the dictionary; word sense disambiguation based on vectors formed by lexical, semantic and grammatical cues of context; and semantic role labeling.
The final chapters of the first part of the book outline two types of frequency dictionaries based on the RNC data: a general-purpose frequency dictionary and a lexico-grammatical one.
The second part of the book presents an analysis of corpus data and includes a number of case studies of Russian grammar and lexical-grammatical interaction using quantitative methods. The key concept underlying our analysis is the behavioral profile (Hanks 1996; Divjak, Gries 2006), which is the frequency distribution of variable elements in a linguistic unit as attested in a corpus. This covers grammatical profiles (the frequency distribution of inflected forms of a word), constructional profiles (the frequency distribution of argument or any other constructions attested for a key predicate), lexical and semantic profiles (the frequency distribution of words and lexical-semantic classes in construction slots or, more generally, in the context of a word), and radial category profiles (the frequency distribution of word senses and word uses across the radial category network of a polysemous unit). We use grammatical, constructional, semantic, and radial category profiling to study tense, aspect and mood specialization of Russian verb forms; to identify singular-oriented and plural-oriented nouns; to investigate factors for prefix choice and prefix variation in natural perfectives (chistovidovye perfectivy); and to analyze constraints on the filling of slots in a construction and how this affects the meaning of the construction, taking as examples the Genitive construction of shape and the spatial construction with the preposition poverkh ‘up and over’.
The quantitative corpus-based techniques used for the analysis range from simple descriptive statistics (e.g., absolute frequencies, percentages, measures of central tendency and outliers) to Fisher's exact test and logistic regression. We claim that vector modeling approaches to quantitative grammatical studies are no less effective in theoretical linguistics than in computational linguistics, where they have become a standard tool.
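As an illustration of the two simplest techniques named above, the following minimal Python sketch computes a grammatical profile (the relative frequency of each inflected form of a lemma) and a two-sided Fisher's exact test on a 2x2 table of singular and plural counts. It is not the authors' implementation: the function names, the (lemma, tag) input format, and the toy counts for "glaz" and "nos" are assumptions invented for the example, not data from the RNC.

```python
from collections import Counter
from math import comb


def grammatical_profile(tagged_tokens, lemma):
    """Relative frequency of each inflectional tag attested for one lemma.

    tagged_tokens is an iterable of (lemma, tag) pairs; this input format
    is an assumption made for the sketch.
    """
    counts = Counter(tag for lem, tag in tagged_tokens if lem == lemma)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Invented toy data: (lemma, number tag) pairs, not real corpus counts.
    tokens = [("glaz", "pl")] * 3 + [("glaz", "sg")] + \
             [("nos", "sg")] * 3 + [("nos", "pl")]
    print(grammatical_profile(tokens, "glaz"))
    # Rows: glaz, nos; columns: singular, plural.
    print(fisher_exact_two_sided(1, 3, 3, 1))
```

With realistic corpus counts, a small p-value would suggest that two nouns differ in their orientation toward singular or plural forms. With counts as small as the toy ones here, no such conclusion would be warranted.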
The 12th volume of the series contains the texts of Novgorod birch-bark documents Nos. 916-1062, unearthed in the course of the excavations of 2001-2014, as well as those found in Staraya Russa (Nos. 37-45). Most of the published documents originate from the Troicky excavation site and are dated to the 12th century. The core of the volume is formed by the documents from estate Ж, where the concentration of birch-bark letters is significantly higher than at any other medieval Novgorod estate explored so far. Of special importance are two deposits: the financial and economic records of Yakim (second half of the 12th century), comprising the largest set of documents written by one hand, and the correspondence of Luke, Ivan and Snovid (mid-12th century), containing fine examples of Early Rus’ merchants’ correspondence. The texts of the documents are published with comprehensive linguistic and historical commentary. The second part of the book contains corrections to the readings and interpretations of the birch-bark documents published in the previous volumes of the series, as well as updates to some of the tables of extra-stratigraphic dating published in the 10th volume. The volume also contains a linguistic index and a list of conventional dates of the published documents.
The book addresses the issue of concessives in Russian and analyzes the semantics of Russian concessive expressions. The first chapter explicates the core meaning shared by all Russian concessives, which is based on the simpler senses of condition and negation. The second chapter defines the main modifications of the core meaning along the lines of semantic conversion and actantial addition, as well as semantic shifts concerning the senses of probability, desirability and degree. The third chapter describes the meanings of more than sixty concessive items in Russian. The fourth chapter provides a corpus analysis of Russian concessive constructions, in particular their semantic and combinatorial properties. The fifth and sixth chapters consider concession among similar lexico-grammatical meanings. The seventh chapter presents lexicographic descriptions of concessives in the Active Dictionary of Russian.
«Languages of Africa: an attempt at a lexicostatistical classification» has been planned as a multi-volume monograph that aims at a complete, step-by-step re-evaluation of current hypotheses on the genetic classification of most of the languages currently or until recently spoken on the African continent. The relevance of this task goes far beyond the current needs and issues of historical linguistics. In recent decades, significant progress has been achieved in reconstructing the human prehistory of Africa through important discoveries and systematizations in the fields of anthropology, archaeology, and population genetics, allowing for a thorough reassessment of earlier conceptions and beliefs on the subject. At the same time, the general «standard model» for the overall classification of Africaʼs languages, introduced by Joseph Greenberg more than half a century ago, still continues to serve as the default scheme of reference for linguists and non-linguists alike. This is due not so much to any exceptional robustness inherent in the principles and methods according to which it was originally constructed as to the complete lack of a well-grounded alternative. A plethora of new high-quality linguistic material has been accumulated over the past fifty years, and Greenbergʼs methodology of «multilateral comparison» has been harshly criticized over the same period, leading more and more specialists in the field to doubt or even completely reject most of his «macrofamily» groupings. Nevertheless, it remains obvious that, as long as no constructive challenge is presented, Greenbergʼs «quadripartite» scheme, according to which the absolute majority of Africaʼs languages falls into one of the four macrofamilies (Khoisan, Nilo-Saharan, Niger-Kordofanian, or Afro-Asiatic), will remain in active usage, for technical and pragmatic reasons if nothing else.
The third volume in this ongoing series, following the same analytical procedure as the previous two, completes the preliminary historical survey, lexicostatistical analysis, and re-classification (as a new work-in-progress reference model) of all the low-level language groups that had earlier been included in Greenbergʼs alleged «Nilo-Saharan» macrofamily. This task, begun in Volume 2 with the analysis of the single largest building block of Nilo-Saharan (the so-called «Eastern Sudanic» family), is now rounded out with the inclusion of all the other potential constituents of Nilo-Saharan: the large Central Sudanic family (somewhat controversial in itself, since it consists of no fewer than six distinct members, the genetic relations between which have not yet been explored to common satisfaction); the smaller Saharan, Maba, and Koman families; and such «macro-languages» and language isolates as Berta, Kunama, Gumuz, Fur, and Songhay. The survey also includes the small Krongo-Kadugli language group, spoken in the Nuba Mountains, and the small language isolate Shabo in Ethiopia, neither of which was included by Greenberg in the original Nilo-Saharan hypothesis, but both of which came to be regarded by some subsequent researchers as potential members of the macrofamily.