Everyday Conversations: a Comparative Study of Expert Transcriptions and ASR Outputs at a Lexical Level

Lecture Notes in Computer Science. 2023. Vol. 14338. P. 43–56.

Sherstinova T., Михайловский Н. Э., Kolobov R.

The study examines the results of automatic speech recognition (ASR) applied to field recordings of everyday Russian speech. Everyday conversations, captured in real-life communicative scenarios, are a difficult subject for ASR for several reasons: they can contain speech from many speakers, the loudness of the conversation partners’ speech signals fluctuates, a substantial share of the speech overlaps between two or more speakers, and significant noise interference occurs periodically. The research compares transcripts of these recordings produced by two recognition systems, the NTR Acoustic Model and OpenAI’s Whisper, with expert transcriptions of the same recordings. A comparison of three frequency lists (the expert transcription, the acoustic model, and Whisper) reveals that each model has its own characteristics at the lexical level. At the same time, both models perform worse in recognizing the following groups of words typical of spontaneous unprepared dialogues: discourse words, pragmatic markers, backchannel responses, interjections, reduced conversational word forms, and hesitations. These findings are intended to foster improvements in ASR systems designed to transcribe conversational speech, such as work meetings and everyday dialogues.
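The frequency-list comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenization rule, the function names `frequency_list` and `underrecognized`, and the toy transcripts are assumptions made for illustration only. The sketch builds a word frequency list for a reference (expert) transcript and for an ASR transcript, then lists the words that occur less often in the ASR output than in the reference.

```python
import re
from collections import Counter


def frequency_list(text):
    """Build a word frequency list from a transcript.

    Assumption: tokens are runs of word characters, lowercased.
    Real transcripts would need corpus-specific normalization.
    """
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def underrecognized(reference, hypothesis):
    """Return (word, deficit) pairs for words that are less frequent
    in the ASR frequency list than in the reference frequency list,
    sorted by largest deficit first, then alphabetically."""
    deficits = {}
    for word, ref_count in reference.items():
        diff = ref_count - hypothesis.get(word, 0)
        if diff > 0:
            deficits[word] = diff
    return sorted(deficits.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data (hypothetical, not taken from the study's recordings).
expert = frequency_list("ну да вот это ну как бы да ага")
asr = frequency_list("да вот это как бы да")

for word, deficit in underrecognized(expert, asr):
    print(word, deficit)
```

In this toy case the words missing from the ASR output are a discourse particle and a backchannel response, the kind of items the abstract reports as poorly recognized. A frequency comparison of this sort ignores word order and alignment, so it complements rather than replaces measures such as word error rate.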

Linguistic Analysis of Perfume Advertising in English-Language and Russian-Language Discourse

Gabrielova E., Шевякова Ю. С., Вестник Удмуртского университета 2026 Vol. 36 No. 2 P. 344–354

In today's globalized world, the effectiveness of sales and the success of products largely rely on well-crafted advertising texts. Influenced by this factor and the growing competition, advertising continuously evolves, incorporating various linguistic, psychological, and cross-cultural techniques. This study focuses on the linguistic and stylistic analysis of perfume advertising texts within English and Russian discourses, ...

Added: May 25, 2026

On the Curse Formula in Wʿzb’s Inscription (RIÉ 192 B, ll. 5–9)

Bulakh M., Aethiopica 2025 Vol. 28 P. 39–52

The article deals with the curse formula belonging to the sixth-century inscription by an Aksumite king Wʿzb (RIÉ 192 B, ll. 5–9). After summarizing the extant interpretations, the author proposes a new reading and interpretation, arguing that the text under scrutiny follows the same pattern and employs the same rhetoric devices as the curse formulas ...

Added: May 23, 2026

Practicamos el Subjuntivo

Bocharov Y., M.: -, 2025.

This textbook is designed for students improving their Spanish proficiency at levels B1-B2. It consists of five topics and a selection of texts to reinforce them. The first topic covers the morphology of the four tenses (present, perfect, imperfect, subjunctive perfect) and exercises on the formation of forms. The remaining topics are devoted to exploring ...

Added: May 23, 2026

Aesthetics of Audiovisual Journalism. Textbook. 2nd edition

Novikova A., Бережная М. А., Кирия И. В., КноРус, 2026.

The aesthetics of journalism is substantiated as a necessary component in the professional training of specialists in audiovisual media. The factors and trends of historical and current changes in the aesthetics of journalism are presented, and the aesthetic practices of audiovisual journalism are characterized in terms of their social functioning. Criteria for aesthetic evaluation are ...

Added: May 22, 2026

Juxtapositional vs. possessive-like encoding in Russian specificational constructions

Logvinova N., Russian linguistics 2026 Vol. 50 Article 11

This paper presents the first in-depth corpus-based study of a previously overlooked syntactic variation in Russian: the competition between juxtapositional (Nominative) and possessive-like (Genitive) encoding of the second noun (the term) in specificational constructions (e.g., ponjatie čest’ (notion.NOM honor.NOM) vs. ponjatie česti (notion.NOM honor.GEN) ‘the notion of honor’). While typological research has established cross-linguistic preferences for one encoding strategy over another, intralinguistic variation ...

Added: May 18, 2026

FOCUS ON VOCABULARY. Economics of Tangible and Intangible Assets: A Corpus-Based Dictionary and AI Exercises for English

Gorina O. G., Kucherenko S., Larisa K. et al., St. Petersburg: Asterion, 2026.

This textbook is an integrated teaching and learning resource for English for Specific Purposes (ESP) in the field of economics of tangible and intangible assets. Its design employs (i) modern corpus linguistics methods, including frequency analysis and keyword extraction based on authentic texts reflecting current trends in professional discourse, and (ii) artificial intelligence technologies for ...

Added: May 16, 2026

THE COGNITIVE-ASSOCIATIVE FIELD OF THE ONYMS OF ST. PETERSBURG AND VIENNA

Зелинская Ю. Ю., Когнитивные исследования языка 2025 No. 4(65) P. 180–186

The article focuses on the study of the onym as a cognitive stimulus that facilitates the decoding of the language of urban space across two ethnic groups. The research is grounded in the analysis of results from an onomastic associative experiment, aimed at identifying the dominant types of associative responses to anthroponyms, oikodonyms, hodonyms, and ...

Added: May 16, 2026

Person-Number Asymmetry: Agreement of Passive Miratives in the Kazym Dialect of Khanty

Starchenko A., Toldova S., Типология морфосинтаксических параметров 2023 Vol. 6 No. 1 P. 130–148

The study focuses on a previously unrecorded model of split agreement in the mirative paradigm in Kazym Khanty. Split agreement is found when comparing active and passive mirative constructions, as well as in a limited set of uses of non-finite forms. In the passive voice, unlike the active voice, the 3rd person is unmarked and the ...

Added: May 14, 2026

Verbs of Substance Motion in Slavic Languages

Fedorov D., Jezikoslovni Zapiski 2026 Vol. 32 No. 1 P. 23–52

This article describes verbs denoting motion of liquid and dry substances in Slavic languages. The research explores how Slavic languages lexicalize different situations within the semantic field of substance motion and identifies the parameters that drive this lexicalization (e.g., type of substance, intensity and quantization of flow, and causation). Adjacent grammatical phenomena such as argument ...

Added: May 13, 2026

The Image of Women Through the Years: A Diachronic Analysis of the Representation of Women in Russian Agitation Advertising

Gabrielova E., Максименко О. И., Социальные и гуманитарные науки на Дальнем Востоке 2026 Vol. 23 No. 1 P. 241–249

The article presents a diachronic analysis of the representation of women in Russian advertising, based on agitation posters from 1917-1990 and social and motivational advertising materials from 2000-2020. The aim of the study is to identify the evolution of verbal and visual strategies for constructing the image of women in the changing socio-political and cultural ...

Added: May 13, 2026

Proceedings of the 9th Student Research Workshop associated with the International Conference Recent Advances in Natural Language Processing

Velichkov B., Nikolova-Koleva I., Slavcheva M., Shumen: INCOMA Ltd, 2025.

The RANLP 2025 Student Research Workshop (RANLPStud’2025) is a special track of the established international conference Recent Advances in Natural Language Processing (RANLP’2025). The RANLPStud is being organised for the 9th time and this year is running in parallel with the other tracks of the main RANLP 2025 conference. The target of RANLPStud’25 is to be a ...

Added: May 12, 2026

T. Pratchett's "Discworld" Through the Eyes of the Russian-Speaking Fandom

Кульков А. Н., Tsvetkova M. V., Вестник Томского государственного университета. Филология 2026 No. 100 P. 158–173

This is the first attempt to examine the features of fan fiction as an act of productive reception that arose in Russia on the basis of Terry Pratchett's cycle of Discworld novels. The analysis shows that fan fiction authors strive above all to convey the style and comic element of Pratchett's original cycle, regardless of the genre and format of the works they create. Fic writers most often turn to such formats as ...

Added: May 10, 2026

Evidence-Based Educational Interventions for Developing and Improving Reading Comprehension in Adolescents

Логвиненко Т. И., Стрельцова А. В., Otstavnov N. et al., Вопросы образования 2025 No. 2 P. 101–141

The aim of this article is to review empirical studies, meta-analyses and systematic reviews on educational interventions for developing and improving reading comprehension in adolescents, including both typically developing readers and those experiencing reading difficulties. We distinguish seven intervention types aimed at improving reading comprehension, each targeting different components as the basis for intervention: decoding and reading ...

Added: December 11, 2025

From Wine to Moonshine: The Topos of Drunkenness in Student Songs

Воробьев В. А., in: Толока: сборник статей к 60-летию А.Б. Мороза.: М.: РГГУ, 2025. P. 127–152.

The topic of drunkenness plays a significant role in student songs and is expressed through specific vocabulary, primarily the names of alcoholic beverages. The article examines a group of over 400 occurrences in three corpora (more than 500 texts) in comparison with the social and historical-cultural context of the songs’ existence. The analysis focuses on the ...

Added: October 9, 2025

Comparative Analysis of Encoder-Based NER and Large Language Models for Skill Extraction from Russian Job Vacancies

Matkin N., Smirnov A., Usanin M. et al., in: 12th International Conference, AIST 2024, Bishkek, Kyrgyzstan, October 17–19, 2024, Revised Selected Papers.: Cham: Springer, 2025.

The labor market is undergoing rapid changes, with increasing demands on job seekers and a surge in job openings. Identifying essential skills and competencies from job descriptions is challenging due to varying employer requirements and the omission of key skills. This study addresses these challenges by comparing traditional Named Entity Recognition (NER) methods based on ...

Added: July 26, 2025

Kak že kak že! Russian discourse formula of confirmation as a marker of recognition

Ekaterina Rakhilina, Bychkova P., in: Constructions with lexical repetitions in East Slavic.: De Gruyter Mouton, 2024. P. 197–222.

The chapter presents a case study of the repetition mechanism within the development of discourse formulae, i.e., multi-word formulaic replies similar to yes and no. It closely examines the process of pragmaticalization in the Russian formula Kak že! (‘how part’) and its duplicated counterpart. The diachronic corpus data shows that the formula Kak že! emerged ...

Added: February 13, 2025

INCORPORATING INTERJECTIONS TO FACILITATE CONVERSATIONAL FLOW

Rodomanchenko A., in: Teaching English in Global Contexts, Language, Learners and Learning.: Electronic publication, 2023. P. 199–211.

Have you ever been in a situation where you lost your train of thought because of being asked a question mid-talk or were distracted by a side comment? Probably, like others, you struggled to get back on track. Although such interruptions are part of authentic conversations, they are rarely addressed in English classes. In this ...

Added: February 8, 2024

The Function of Metacommunicative Markers in Russian-Speaking Communication (a Sociolinguistic Aspect)

T.I. Popova, Communication studies 2021 Vol. 8 No. 3 P. 454–464

The article considers the use of metacommunicative pragmatic markers with respect to gender, taking into account the social roles of the speaker. The research draws on the ORD corpus of Russian everyday speech, known as the “One Speaker’s Day” corpus, which is based on transcripts of audio recordings obtained under real-life conditions. The volume of ...

Added: October 16, 2023

English Lexicology

Киселева С. В., Кононова И. В., Trofimova N., St. Petersburg: ., 2022.

This textbook is intended for students studying in the bachelor's degree program "Linguistics" and preparing for the exam in the discipline "Fundamentals of the theory of the first foreign language". The manual aims to give students an idea of the specifics of the vocabulary of the modern English language, the origin of words, the problems of the meaning of ...

Added: April 9, 2023

COVID-19 as a Linguistic Phenomenon and its Influence on the Development of Modern Regional Terminology

Pesina S., Kiseleva S., Nella A. Trofimova et al., Journal of Pharmaceutical Negative Results 2022 Vol. 13 No. S8 P. 2985–2991

The article is devoted to the study of COVID-19 as a linguistic phenomenon based on the material of the Russian and English languages, as well as the impact of the pandemic on the vocabulary of two languages. The article examines the influence of the course of the coronavirus pandemic on the meaning of neologisms of ...

Added: December 11, 2022

Pragmatic Markers of Russian Everyday Speech: Invariants in Dialogue and Monologue

Bogdanova-Beglarian N., Blinova O. V., Sherstinova T. et al., in: Speech and Computer. 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, September 27–30, 2021. Vol. 12997.: St. Petersburg: Springer, 2021. P. 81–90.

The paper presents the distribution of pragmatic markers (PMs) of Russian everyday speech in two types of discourse: dialogue and monologue. PMs are an essential part of any oral discourse; quantitative data on their distribution are therefore necessary for solving both theoretical and practical tasks related to studies of speech communication, as well as for ...

Added: October 31, 2021

A Grammar of May: An Austroasiatic Language of Vietnam

Babaev K., Samarina I., Brill, 2021.

Not only is May otherwise undescribed in writing, but it is also the only small Vietic language documented and analysed in such detail, and one of few endangered Austroasiatic languages described so thoroughly. May is predominantly monosyllabic, yet retains traces of affixes and consonant clusters that reflect older disyllabic forms. It is tonal, and also manifests ...

Added: June 24, 2021

Using TXM Platform for Research on Language Changes over Time: The Dynamics of Vocabulary and Punctuation in Russian Literary Texts

Lavrentiev A. M., Sherstinova T., Chepovskiy A. et al., Vestnik Tomskogo Gosudarstvennogo Universiteta, Filologiya 2021 Vol. 70 P. 69–89

The purpose of this paper is to test the methodological tools provided by the TXM platform for research on the dynamics of vocabulary and punctuation marks in diachronic corpora. TXM is a powerful text analysis tool that provides both quantitative and qualitative features in a transparent open-source implementation. In this paper, we demonstrate how it can be ...

Added: June 24, 2021

Pragmatic markers in the aspect of communicative alignment

Трощенкова Е. В., Blinova O. V., Вестник Волгоградского государственного университета. Серия 2: Языкознание 2020 Vol. 19 No. 3 P. 49–58

The article presents a model of communicative alignment in the use of pragmatic markers (PMs) in Russian everyday dialogical communication. The main objectives are to check whether speakers coordinate their linguistic behavior not only through lexemes, grammatical forms, or constructions, but also through PMs, and to establish how this actually works. We suppose that the ...

Added: November 1, 2020