The hypothesis that the lexical nature of mixed languages depends on their patterns of emergence
This study investigates mixed languages, with a specific focus on their lexical characteristics. It proposes and substantiates the hypothesis that the degree of lexical mixing in such languages, reflected in the prevalence of doublets and the distribution of vocabulary between source languages, is linked to the specific pattern of their emergence rather than to a diachronic process of stabilisation over time, as suggested by Auer’s model. The research is grounded in an analysis of core vocabulary using the Swadesh 100-word list, applied to the following mixed languages: Gurindji Kriol, Light Warlpiri, Michif, Media Lengua, Angloromani, and Medny Aleut. Quantitative data on lexical origins and doublet frequencies were collected from existing dictionaries and prior studies, enabling a comparative assessment.
Three primary patterns of emergence are identified and analysed. The first, code-switching (observed in Gurindji Kriol and Light Warlpiri), is characterised by a high proportion of doublets and a relatively even distribution of the lexicon between the source languages. The second, language intertwining (exemplified by Michif), exhibits a moderate level of doublets and mixed vocabulary, resulting from the combination of the verbal system of one language (Cree verbs) with the nominal system of another (French nouns). The third, relexification (as in Media Lengua), shows minimal doublets and a clear dominance of one lexifier; it involves the massive replacement of native lexical stems with items from an introduced language while the ancestral grammatical frame is retained. In addition, the study advances the concept of mirror-opposite relexification (applicable to Angloromani and Medny Aleut). In this pattern, the ancestral language (Romani, Aleut) provides the bulk of the lexicon, which is the reverse of the Media Lengua model, as discussed by Croft and Chlenov.
The findings directly challenge Auer’s diachronic continuum model (code-switching → language mixing → fused lects), which posits that mixed languages gradually reduce variation and stabilise over time. The data reveal no consistent correlation between the age of a language and its lexical variability. For instance, Michif, which emerged in the early 1800s, shows a higher degree of lexical mixture and more doublets than the younger Medny Aleut, which emerged in the late 1800s; this is the opposite of what gradual stabilisation over time would predict. Instead, the lexical profile, and especially the frequency of doublets, shows a strong and systematic correlation with the sociolinguistic pathway of formation. Code-switching leads to the most lexically mixed outcomes, while relexification (both standard and mirror-opposite) results in the least mixed profiles, each dominated by a single lexifier. The study concludes that the “mixed nature” of a language, particularly its lexical structure, is fundamentally shaped by its emergence pattern, which in turn predicts key features such as the distribution of vocabulary between lexifiers and the prevalence of doublets.