The Bimodal Corpus of Russian-Turkic Bilinguals’ Speech
The paper presents the design of the Russian-Turkic Bilingual Corpus (RuTuBiC) and its basic identifying features: the aim of the corpus, the types of texts it contains, the principles of metatextual markup and error annotation, and the underlying technological (IT, digital) concepts. The current state and development trends of the corpus are discussed. The corpus started as an integral part of a research project intended to explore the dynamics of interaction between languages and cultures in South Siberia. It comprises recordings of the oral speech of Russian-Turkic (Russian-Tatar, Russian-Shor and Russian-Khakass) bilinguals, transcribed and error-annotated. The corpus data make it possible to reveal mother tongue influence within the system of deviations from the speech standard in bilingual speech by setting these deviations against their various possible sources, and to trace the influence of social and linguistic factors on the occurrence of such deviations.