A morphological analyser for Maltese
This article describes the development of a free/open-source morphological description of Maltese, originally created as the analysis component in a rule-based machine translation system for Maltese to Arabic and later applied to other tasks. The lexicon formalism we use is lttoolbox, part of the Apertium machine translation platform. An evaluation of the analyser shows that the coverage is adequate, at 84.90%, while precision is 92.5% on a large automatically annotated test set and 96.2% on a smaller hand-validated set.
The paper discusses two approaches to the automatic lexico-grammatical tagging of Middle Russian texts (1400–1700) included in the Russian National Corpus (RNC). The task is to assign each token a part-of-speech label, a tuple of grammatical features, and a lemma (without disambiguation). Middle Russian combines, on the one hand, features of the earlier state of the grammatical system, including aorist and imperfect verb forms, the dual number, and a number of archaic inflectional paradigms, and, on the other hand, features of modern Russian inflectional morphology. In the lexicon, we can see the same mix of Old Russian and Modern Russian lemmas. Moreover, the texts can contain Church Slavonic and dialectal forms. The absence of a standardised orthography and of a standard variant poses even more challenges to processing Middle Russian texts. The first approach is based on writing an electronic dictionary of Old Russian and building a module to handle spelling inconsistency. In the absence of open electronic resources for Middle Russian morphology, an electronic dictionary of Church Slavonic was expanded and adapted to Middle Russian. The paper describes the steps required to change nominal and verbal entries in this dictionary. We follow the principle of «a wider expansion», which presupposes that the analyser is allowed to generate as many annotations as possible so that at least one annotation is correct. The second approach uses, firstly, an existing Modern Russian tagger supplemented by a module reducing spelling variation, and secondly, a database of lexico-grammatical annotations retrieved from the Diachronic corpus of the RNC. We evaluate the output of both analysers against manually annotated test data. We also discuss the benchmark scores and outline future prospects for the development of the Middle Russian taggers.
Purpose. This article addresses the question of how to identify product concepts of megacities as complex products that are simultaneously consumed and created by user groups with different preferences and behavioural patterns.
Methodology. We use empirical research to systematize heterogeneous descriptive data about the attributes of city districts and the everyday activities of their residents to further identify the key uses of the places in which they live. Classifiers are used as a tool to systematize city product technologies and uses. This tool was built deductively on the principle of morphological analysis (Zwicky 1969).
Findings. Ten distinctive product concepts of 12 Moscow districts and the city outside them were formulated as distinctive sets of benefits or district uses (needs satisfied and activities encouraged) offered to residents. The concepts yielded represent combinations of seven abstract types.
Originality. The paper proposes a new method for city product analysis which combines the advantages of the standardized (Kotler et al., 1999) and narrative (Warnaby and Medway, 2013) approaches to place product and brand building. On the one hand, this study expresses each of the city product concepts in terms of typical constructions, and, on the other hand, in contrast to previous studies, it fully reflects the distinctive features and specificities of the big city and its districts.
In narrative terms, much uncertainty still exists about the motivational and behavioural distinctions between city users as well as crucial differences in product concepts they share and the distinctive marketing strategies that satisfy them. The present study’s results demonstrate how consistent and inconsistent city use patterns can be identified.
Practical implications. We believe that the analytical procedure that we have developed is a much-needed supplement to the existing techniques that are used when shaping the product strategies of megacities. Identifying contradictory uses helps make product decisions that are appropriate to support all these uses concurrently.
Process models and graphs are commonly used for modeling and visualizing processes. They may represent sets of objects or events linked with each other in some way. The wide use of models in such languages creates a need for tools to create and edit them. This paper describes Carassius, a model editor that supports classical graphs, Petri nets, finite-state machines and their systems. The tool also offers features such as simulation of Petri nets and import and export of models in different storage formats. Carassius is a modular tool which can be extended with, for example, new formalisms. The paper gives a detailed description of a couple of layout algorithms that can be used for visualizing Petri nets and graphs. Carassius might be useful for educational and research purposes because of its simplicity, range of features and variety of supported notations.
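To make the idea of Petri net simulation concrete, the following is a minimal sketch of the standard token-game firing rule in Python. It is not taken from Carassius; the names `Transition`, `enabled` and `fire`, and the representation of a marking as a dictionary from place names to token counts, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

Marking = Dict[str, int]  # hypothetical representation: place name -> token count


@dataclass
class Transition:
    """A transition with weighted input and output arcs (illustrative only)."""
    name: str
    inputs: Dict[str, int] = field(default_factory=dict)   # place -> arc weight
    outputs: Dict[str, int] = field(default_factory=dict)  # place -> arc weight


def enabled(marking: Marking, t: Transition) -> bool:
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(place, 0) >= weight for place, weight in t.inputs.items())


def fire(marking: Marking, t: Transition) -> Marking:
    """Return the marking reached by firing t; the original marking is not modified."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name} is not enabled")
    new_marking = dict(marking)
    for place, weight in t.inputs.items():
        new_marking[place] = new_marking.get(place, 0) - weight  # consume tokens
    for place, weight in t.outputs.items():
        new_marking[place] = new_marking.get(place, 0) + weight  # produce tokens
    return new_marking


# Usage: one token moves from p1 to p2 when t1 fires.
t1 = Transition("t1", inputs={"p1": 1}, outputs={"p2": 1})
start = {"p1": 1, "p2": 0}
after = fire(start, t1)
```

A simulator would repeat this step, choosing one enabled transition at a time until none remains enabled.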
The study of dialects and spoken forms of Arabic is a prominent issue in contemporary Arabic studies, which focus, among other things, on the vernacular specifics of Arabic-speaking societies, as well as on the features and prospects of the development of colloquial forms of Arabic. The paper presents a preliminary study of the modern Maltese language (il-Malti) and of its evolution from one of the Arabic dialects to the national language of Malta.
The highly unstable orthography of Middle Russian texts poses a challenge for their automatic processing. The Middle Russian subcorpus of the Russian National Corpus (RNC) includes documents written mainly between 1400 and 1700, when variation in spelling was still the norm. The task of lexico-grammatical analysis is to assign a dictionary form (lemma), part of speech and grammatical tags to each word form in the corpus. Traditional methods of part-of-speech and grammatical tagging assume that there is (almost always) only one possible string of characters representing the stem and ending of each grammatical form of a word. Since unstable orthography yields a many-to-many mapping between word forms and grammatical annotations, morphological taggers perform poorly and need orthographic normalization as a preprocessing step.
We use both relative and absolute normalization of orthographic representation. The relative normalization involves multiplying orthographic representations of stems and endings in the grammatical dictionary by regular rules. It is carried out at the level of (a) word endings; (b) nominative stems with regular variation, e.g. russk(ij) / russt(ij), keli(ja) / kel'(ja); (c) nominative stems of Church Slavonic origin, e.g. odin- / edin-; (d) verb stems with prefixes; etc. The absolute normalization matches characters (or character combinations) which alternate regularly in the corpus (e.g. o / ѡ 'omega', e / ѣ, шт / щ, жю / жу). The absolute normalization applies to both orthographic representations in the grammatical dictionary and word forms in the text.
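The absolute normalization described above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the rule table contains only the four alternations quoted in the text, the choice of which member of each pair serves as the canonical form is an assumption, and the function names `normalize_absolute`, `build_index` and `lookup` are hypothetical.

```python
# Illustrative rule table: (variant, canonical form).
# Only the alternations named in the text are listed; the direction of
# each mapping is an assumption made for this sketch.
ABSOLUTE_RULES = [
    ("ѡ", "о"),    # omega -> Cyrillic o
    ("ѣ", "е"),    # yat -> Cyrillic e
    ("шт", "щ"),
    ("жю", "жу"),
]


def normalize_absolute(form: str) -> str:
    """Map a word form to a canonical spelling by applying every rule in order."""
    form = form.lower()
    for variant, canonical in ABSOLUTE_RULES:
        form = form.replace(variant, canonical)
    return form


def build_index(dictionary_forms):
    """Index dictionary forms by their normalized spelling.

    The same normalization is applied to the dictionary side and to the
    text side, so spelling variants meet at the same key.
    """
    index = {}
    for form in dictionary_forms:
        index.setdefault(normalize_absolute(form), set()).add(form)
    return index


def lookup(token: str, index) -> set:
    """Return all dictionary forms whose normalized spelling matches the token's."""
    return index.get(normalize_absolute(token), set())


# Usage: a text token spelled with yat is matched to a dictionary form spelled with e.
index = build_index(["лето", "отецъ"])
matches = lookup("лѣто", index)
```

Because both sides pass through the same function, the lookup does not depend on which spelling the dictionary happens to use.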
The paper studies the reaction of Italian literary critics to the publication of Boris Pasternak's novel "Doctor Jivago". It analyses the book "'Doctor Jivago', Pasternak, 1958, Italy" (published in Russian in "Reka vremen", Moscow, 2012). The papers of Italian writers, critics and literary historians who reacted immediately to the publication of the novel (A. Moravia, I. Calvino, F. Fortini, C. Cassola, C. Salinari, etc.) are studied and analysed.
The article describes the patterns of realization of emotional utterances in dialogic and monologic speech. The author pays special attention to the characteristic features of the speech of a speaker experiencing psychological tension and to the compositional and pragmatic peculiarities of dialogic and monologic text.