Length of East Caucasian subject indexes: a quantitative study
In this article I present a connection between the frequency and the length of person-number indexes, drawing on two independent studies: token frequency obtained from the Universal Dependencies treebanks and type frequency gathered in a typological study. After introducing the results of those two studies, I present East Caucasian data and show that the unusual history of person-number indexes in these languages leads to violations of these tendencies.
The volume includes the proceedings of the 23rd Scandinavian Conference of Linguistics (SCL 23), which was held at Uppsala University on 1–3 October 2008. It includes studies covering a wide spectrum of approaches to linguistics, for example, cross-linguistic typological studies, linguistic variation and language change in contact situations, as well as studies relating to bilingualism and to second and foreign language learning.
The paper discusses efforts to create a morphological standard for the Middle Russian corpus, which is part of the historical collection of the Russian National Corpus (RNC). To meet the needs of different categories of corpus researchers as well as NLP developers, we consider two styles of morphological annotation (the RNC schema and the Universal Dependencies schema). A number of specifications of the feature list are proposed to facilitate data reusability, linking and conversion.
The starting point of the study is the hypothesis of a discursive proximity between Church Slavonic and the Christian religious discourse of the modern Russian language. Analysing lexical structure with quantitative corpus methods, we show that the latter is closer to Church Slavonic than mainstream modern Russian is. This can serve as evidence of the specificity of the register in question and as an additional argument when deciding on its separate status. The research is based on the material of the Russian National Corpus, namely the Church Slavonic corpus, the Main corpus and the Subcorpus of church-and-theology texts. Using the log-likelihood criterion and PCA visualizations, we reveal the body of lexemes in Russian texts that can be considered Slavonicisms (tserkovnoslavyanizmy) and show that the "distance" between the corpora can be measured differently if one takes into account adjectives, nouns and verbs separately.
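As an illustration of the log-likelihood criterion mentioned above, the following is a minimal sketch of the two-corpus log-likelihood (G2) statistic commonly used in corpus linguistics to compare a lexeme's frequency in two corpora. It is not taken from the study itself: the function name, argument names and the example counts are hypothetical, and the authors' exact formulation may differ.

```python
import math


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Two-corpus log-likelihood (G2) for a single lexeme.

    freq_a, freq_b: occurrences of the lexeme in corpus A and corpus B.
    size_a, size_b: total token counts of corpus A and corpus B.
    All names here are illustrative assumptions, not the study's own.
    """
    total = freq_a + freq_b
    # Expected frequencies if the lexeme were equally likely in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        # A zero count contributes nothing (the limit of x * ln x is 0).
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Hypothetical counts: a lexeme seen 20 times in one corpus, never in the other.
score = log_likelihood(20, 0, 1000, 1000)
```

A higher score indicates a larger difference in relative frequency between the two corpora; a lexeme that is equally frequent in both scores zero.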
This book is a collection of articles dealing with various aspects of grammatical relations and argument structure in the languages of Europe and North and Central Asia (LENCA). Topics covered with respect to individual languages are: split-intransitivity (Basque), causativization (Agul), transitives and causatives (Korean and Japanese), aspectual domain and quantification (Finnish and Udmurt), head-marking principles (Athabaskan languages), and pragmatics (Eastern Khanty and Xibe). Typology of argument-structure properties of ‘give’ (LENCA), typology of agreement systems, asymmetry in argument structure, typology of the Amdo Sprachbund, spatial relators (Northeastern Turkic), core argument patterns (languages of Northern California), and typology of grammatical relations (LENCA) are the topics of articles based on cross-linguistic data. The broad empirical sweep and the fine-tuned theoretical analysis highlight the central role of argument structure and grammatical relations with respect to a plethora of linguistic phenomena.
Mehweb Dargwa features a particle gwa, a peculiar element used mainly to emphasize the assertion. The paper explores some grammatical characteristics of this particle. It is shown that, in both verbal and non-verbal clauses, gwa serves as a predicative marker forming a complete predication and is an equivalent of a copula (even though, unlike the neutral copula in Mehweb, it lacks inflection). Like typical East Caucasian predicative markers, gwa may occur in different positions, though its place is syntactically constrained (e.g., it cannot be embedded within syntactic islands). Still, Mehweb speakers allow gwa not to be adjoined to either the predicate or the focus. This makes the distribution of the particle surprising as compared with similar predicative markers in well-described East Caucasian languages, where they may either occur on the predicate or immediately follow the focused element.
In my paper, two approaches to verb classification in Adyghe, a language of the West Caucasian family, are discussed. The first approach is a purely morphological classification based on the choice of person cross-referencing prefixes. The second one is a derivational classification which builds on the morphological mechanisms of reciprocalization and reflexivization. The main research question which lies behind the classification study is whether verbs derived by means of the reciprocal or reflexive marker behave in the course of further valency-changing operations differently from nonderived verbs.
I show that verb classification in Adyghe has some typologically peculiar properties, the main one being that the derivational classification distinguishes more specific classes than the purely morphological one. In other words, the fact that a verb is derived is crucial for its behavior. The language-specific properties of Adyghe are also typologically relevant. They show that derived verbs and derivational mechanisms are of particular relevance in verb classification and should be given more attention in linguistic work on verb classification than is currently done.