In this article we report new experiments in word clustering for the Russian language. We introduce a new clustering method that distributes words into classes according to their syntactic relations. We used a large untagged corpus (about 7.2 billion words) to collect a set of such relations. The corpus was processed with a set of finite-state automata that extract syntactically dependent combinations with explicit structure. These automata were applied only to unambiguous text fragments, because combining these techniques improves the quality of the sampled input data. A modification of group-average agglomerative clustering was used to distribute words among clusters. The resulting set of clusters was tested against one of the semantic dictionaries of the Russian language. The NMI score obtained in this article is 0.457 and the F1-score is 0.607.
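For readers unfamiliar with the evaluation measure named in the abstract above, the following is a minimal sketch of normalized mutual information (NMI) between a gold classification and a clustering. The abstract does not say which normalization the authors used; the arithmetic-mean variant below is an assumption, and the function name `nmi` and the toy labels are purely illustrative.

```python
from collections import Counter
from math import log


def nmi(gold, pred):
    """NMI between two labelings of the same items.

    Assumption: mutual information is normalized by the arithmetic mean
    of the two entropies; other variants (geometric mean, max) exist.
    """
    n = len(gold)
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    joint_counts = Counter(zip(gold, pred))

    def entropy(counts):
        # Shannon entropy of a labeling, from its label frequencies.
        return -sum((c / n) * log(c / n) for c in counts.values())

    # Mutual information: sum over co-occurring (gold, pred) label pairs.
    mi = sum(
        (c / n) * log(c * n / (gold_counts[g] * pred_counts[p]))
        for (g, p), c in joint_counts.items()
    )
    h_gold, h_pred = entropy(gold_counts), entropy(pred_counts)
    if h_gold == 0 and h_pred == 0:
        return 1.0  # both labelings put every item in a single class
    return 2 * mi / (h_gold + h_pred)


# Hypothetical toy data: dictionary classes vs. induced cluster ids.
gold_classes = ["animal", "animal", "tool", "tool"]
same_partition = [7, 7, 3, 3]      # same grouping, different label names
unrelated_partition = [0, 1, 0, 1]  # cuts across the gold classes

print(nmi(gold_classes, same_partition))
print(nmi(gold_classes, unrelated_partition))
```

NMI depends only on how items are grouped, not on the label names, so a clustering that reproduces the dictionary classes under different ids still receives the maximum score, while a clustering that is statistically independent of the gold classes scores zero.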
«Bankruptcy» Concept Within the Legal Linguistics Coordinates: Russian–English–French Approximations
The article addresses the notion of bankruptcy as perceived by speakers of contemporary Russian, English and French, both lawyers and participants in professional communication from other fields. The semantic structure of the term is identified on the basis of its lexicographic and regulatory definitions.
These proceedings include papers on subjects from a wide range of areas, including theoretical linguistics, translation, computational linguistics, natural language processing, and applied linguistics, focusing on a variety of languages, ranging from familiar Indo-European languages to Mandarin Chinese, Wolof, and Dene Sųɬiné. In order to make the papers available to the wider research community, these proceedings are being published electronically and distributed freely at http://www.meaningtext.net
Pleonastic Constructions In English Legal Texts
Many English legal texts, largely in contract law, provide linguistic evidence of terms and/or common vocabulary items with semantically identical or related meanings used together within the same text sequence. Such constructions are difficult for linguists and lawyers alike to classify taxonomically. An analysis of examples allows such usage to be attributed to pleonastic constructions typical of legal language.
The paper discusses standardization efforts aimed at creating a morphological standard for the Middle Russian corpus, which is part of the historical collection of the Russian National Corpus (RNC). To meet the needs of different categories of corpus researchers as well as NLP developers, we consider two styles of morphological annotation (the RNC schema and the Universal Dependencies schema). A number of specifications of the feature list are proposed to facilitate data reusability, linking and conversion.
This paper deals with the Semantics/Pragmatics distinction from a contrastive ethnolinguistic perspective. I argue for the validity of this distinction on the basis of cross-linguistic data. My claim is that the specificity of the so-called language key words [Wierzbicka 1990:15-17], language-specific items particularly representative of the mentality of a given language's speakers, is due to pragmatic rather than semantic peculiarities. These pragmatic peculiarities distinguish the key words both from their synonyms within the same language and from their counterparts in other languages. The languages under discussion are Russian and English, analyzed within a combined framework of the Integral Language Description model [Apresjan 1995:8-238] and Wierzbicka's ethnolinguistic approach.
The paper studies the reaction of Italian literary critics to the publication of Boris Pasternak's novel "Doctor Jivago". An analysis of the book "'Doctor Jivago', Pasternak, 1958, Italy" (published in Russian by "Reka vremen", Moscow, 2012) is given. The papers of Italian writers, critics and literary historians who reacted immediately to the publication of the novel (A. Moravia, I. Calvino, F. Fortini, C. Cassola, C. Salinari, etc.) are studied and analysed.
The article describes patterns in the realization of emotional utterances in dialogic and monologic speech. The author pays special attention to the characteristic features of the speech of a speaker experiencing psychological tension and to the compositional and pragmatic features of dialogic and monologic text.