The Adyghe Corpus and the Orthographic Word
The article discusses the most recent trends in the development of the English progressive. A corpus-based approach is treated as an effective means of establishing the reliability of the retrieved data, and it helps track the main diachronic trend: the increasing frequency of the progressive aspect since the beginning of the 20th century. The article deals specifically with the extension of the progressive to new constructions, such as the modal, present perfect and past perfect passive progressive, and also accounts for the use of progressive forms in contexts not generally characteristic of them.
The paper discusses sociolinguistic applications of statistical analysis to the spoken subcorpus of the Russian National Corpus. Given the considerable size of the corpus (about 10 million tokens), analyzing how various linguistic parameters co-vary with one of the few sociolinguistic parameters available, the speaker's gender, may give rich and interesting results. One specific example of co-variation is considered in detail: the mean length of the utterance (in tokens). In public communication, this parameter shows a statistically significant difference between the speech of men and women (men talk more), while the same difference is absent in private communication. Another important parameter is the gender of the addressee. Again, co-variation is quite different in public and private discourse. In private communication, utterances are longer when addressing someone of the same sex, and the difference between men and women is not statistically significant. In public communication, utterances are longer when addressing a woman, whether the speaker is a man or a woman. These conclusions are consistent with the results of sociolinguistic gender studies obtained elsewhere and by other methods. Linguistic differences between men and women are not absolute but depend on the communicative situation (public vs. private). Public discourse is a playground for linguistic competition in which men are the winning party. In private discourse, competition dissolves.
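The comparison of mean utterance length by speaker gender can be illustrated with a minimal sketch. The abstract does not say which statistical test the paper uses, so the permutation test below is an assumption chosen for illustration. The function names and the `(gender, tokens)` input format are hypothetical and do not reflect the structure of the Russian National Corpus.

```python
import random
from statistics import mean


def mean_length_by_gender(utterances):
    """Average utterance length in tokens for each speaker gender.

    `utterances` is an iterable of (gender, tokens) pairs; this input
    format is an assumption made for the sketch.
    """
    groups = {}
    for gender, tokens in utterances:
        groups.setdefault(gender, []).append(len(tokens))
    return {gender: mean(lengths) for gender, lengths in groups.items()}


def permutation_p_value(a, b, n_iter=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Repeatedly shuffles the pooled lengths, splits them into groups of
    the original sizes, and counts how often the shuffled difference is
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)
```

Under these assumptions, the same two functions could be applied separately to the public and private parts of a corpus, and again with the data grouped by the addressee's gender, to mirror the contrasts described above.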
Four electronic corpora created in 2011 within the framework of the “Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages” Program of Fundamental Research of the RAS are presented. The interface and functionality of these corpora are described, the engineering problems that had to be solved in their creation are explained, and the prospects for their development are discussed. Particular emphasis is placed on the compilation of dictionaries and the automatic grammatical markup of the corpora.
This article examines the types of interaction between prefixes and suffixes in the morphological structure and semantic interpretation of the extremely complex polysynthetic Adyghe verbal form. As we show, the relationships between elements in different parts of the verbal form are both non-trivial and heterogeneous, which suggests that affix interaction can be an important parameter of morphological complexity in languages.
The cases of interaction between prefixes and suffixes vary along three parameters: 1) the direction of the restriction (from suffix to prefix or the other way around); 2) the semantic relations between suffix and prefix; 3) the range of the restriction (suffix and prefix impossible without each other; possible, but with an idiomatic meaning; and so on).
Restrictions of the Adyghe type can be an important criterion for grammatical semantics and morphology, since they show which meanings the language system conceptualizes as close to each other, and to what extent.