This paper reports on a study of the lexico-grammatical distribution of Russian matrix predicates that select kakoj remarkable clauses (so-called ‘embedded’ exclamatives) in the Russian National Corpus, with some cross-linguistic parallels. It reveals that these matrix predicates belong to four conceptual classes: perceptual, mental, emotive, and speech. It shows that the phenomenon of ‘embedded’ exclamatives is irregular for two reasons: (1) matrix predicates seem to be lexically idiosyncratic, and (2) the most frequent forms of matrix predicates (except for optatives) are on the way to being grammaticalized. The paper also suggests accounting for the observed distribution of predicates in terms of the Gricean maxims of conversation.
The paper presents four electronic corpora created in 2011 within the framework of the “Corpus Linguistics: the Albanian, Kalmyk, Lezgian, and Ossetic Languages” Program of Fundamental Research of the RAS. It describes the interface and functionality of these corpora, explains the engineering problems that had to be solved in creating them, and discusses the prospects for their development. Particular emphasis is placed on the compilation of dictionaries and the automatic grammatical markup of the corpora.
This paper describes the system of automatic text summarization developed by the «DC – Systems» company in cooperation with the Faculty of Computer Science at HSE. A summary is a concise description of a text in terms of its content and meaning, i.e. from the point of view of its semantics. The purpose of summarization is to shorten the text as much as possible while preserving its main content. In this article, a summary is built from syntactically correlated word combinations. In this approach, possible additional meanings of separate fragments of the text are neglected. The quality of a summary is evaluated by how well it matches the source text in terms of semantics.
The main problem is split into two parts: evaluating the semantics of the text as a whole, without subdividing it into parts, and transforming the text to derive a summary.
The architecture of the developed system and the main algorithm are described. An example of a summary produced by the system is provided, together with an evaluation of its quality. The current version of the system has the following restriction: it does not permit formulas or special symbols.
The volume is the third issue of a corpus-based grammar of Russian. It deals with parts of speech and, more generally, with formal classes of the lexicon. It comprises descriptive papers on individual parts of speech and minor word classes.
The “Taiga” project combines a corpus and a syntactic parser and is being developed in a new area of corpus linguistics: the material collected primarily serves the needs of machine learning rather than linguistic search. The authors consider in detail the methodology for constructing the corpus, the balance, volume, and composition of its segments, and the format and quality of tagging, which meet the current requirements for developing tools for processing the Russian language. Within the framework of the project, the creation of a large, open-source syntactic corpus in the Universal Dependencies format is planned.
The paper studies the reaction of Italian literary critics to the publication of Boris Pasternak's novel "Doctor Jivago". It gives an analysis of the book ""Doctor Jivago", Pasternak, 1958, Italy" (published in Russian by "Reka vremen", Moscow, 2012). The papers of Italian writers, critics, and literary historians who reacted immediately to the publication of the novel (A. Moravia, I. Calvino, F. Fortini, C. Cassola, C. Salinari, etc.) are studied and analyzed.
The article describes patterns in the realization of emotional utterances in dialogic and monologic speech. The author pays special attention to the characteristic features of the speech of a speaker experiencing psychological tension and to the compositional and pragmatic peculiarities of dialogic and monologic texts.