An Analysis of Errors Made by the MyStem Morphological Analyzer on Recordings of Child Speech
Two important conditions for the effectiveness of a morphological analyzer are the correct recognition of unknown words and successful morphological disambiguation. In this work, we evaluate the results of automatic processing of children’s spontaneous speech with the morphological analyzer MyStem. We analyzed longitudinal recordings of the spontaneous speech of two bilingual children and their parents, created according to the CHILDES protocol. The total length of the recordings was 956 minutes and 420 minutes, respectively. The analysis included 12,828 lines from the transcripts tagged by the parser. Based on the results, we determined the frequency of cases of morphological ambiguity and of morphological analyzer errors. We also propose a typology of these errors and suggest some possible ways of improving the MyStem parser.
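As a rough illustration of how the frequency of ambiguous and unrecognized tokens could be counted, the sketch below tallies them from MyStem-style JSON output. It is not the procedure used in the study. It assumes each word token is a dictionary with an "analysis" list, that more than one entry in that list signals unresolved ambiguity, and that a "qual" value of "bastard" marks a word guessed without a dictionary match. The function name `summarize_analyses` and the sample tokens are hypothetical and written by hand for the example.

```python
from collections import Counter


def summarize_analyses(tokens):
    """Count word tokens, ambiguous tokens, guessed tokens, and tokens with no analysis.

    `tokens` is assumed to follow MyStem's JSON output layout: a list of
    dicts, where word tokens carry an "analysis" list and separators
    (spaces, punctuation, line breaks) carry no "analysis" key at all.
    """
    counts = Counter()
    for token in tokens:
        analyses = token.get("analysis")
        if analyses is None:
            continue  # separator token, not a word
        counts["words"] += 1
        if len(analyses) == 0:
            counts["unanalyzed"] += 1  # the analyzer returned no hypothesis
        if len(analyses) > 1:
            counts["ambiguous"] += 1  # several competing analyses remain
        # Assumption: "qual": "bastard" marks an out-of-dictionary guess.
        if any(a.get("qual") == "bastard" for a in analyses):
            counts["guessed"] += 1
    return counts


# Hypothetical, hand-written sample; not data from the study.
sample = [
    {"text": "мама", "analysis": [{"lex": "мама", "gr": "S,жен,од=им,ед"}]},
    {"text": " "},
    {"text": "мыла", "analysis": [
        {"lex": "мыть", "gr": "V,несов,пе=прош,ед,изъяв,жен"},
        {"lex": "мыло", "gr": "S,сред,неод=род,ед"},
    ]},
    {"text": " "},
    {"text": "бибику", "analysis": [
        {"lex": "бибика", "gr": "S,жен,неод=вин,ед", "qual": "bastard"},
    ]},
    {"text": "\n"},
]

summary = summarize_analyses(sample)
ambiguity_rate = summary["ambiguous"] / summary["words"]
print(dict(summary), round(ambiguity_rate, 3))
```

Counting this way only shows where the analyzer left more than one hypothesis or guessed a lemma. Deciding whether a single returned analysis is actually wrong still requires manual checking against the transcript.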