Domain-independent Classification of automatic Speech Recognition Texts

Mescheryakova E.I., Nesterenko L.V.

Language: English

In book

Computational Linguistics and Intellectual Technologies. International Conference "Dialogue 2017" Proceedings

Vol. 1. Issue 16 (23). M., 2017.

Bridging Gaps in Russian Language Processing: AI and Everyday Conversations

Tatiana Sherstinova, Nikolay Mikhaylovskiy, Evgenia Kolpashchikova et al., in: Proceedings of the 35th Conference of Open Innovations Association FRUCT, 24-26 April 2024, Tampere, Finland. Issue 1. FRUCT Oy, 2024. P. 253–258.

Contemporary advancements in NLP and neural network techniques are paving the way to enhance and harness traditional linguistic resources and corpora, as well as expand the methods of applying neural networks for complex language material. Thus, a weak point for both theoretical and applied linguistic tasks is the processing of spontaneous everyday speech. Two experiments ...

Added: November 29, 2024

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Shuranov E. / Series Computer Science "arxiv.org". 2021.

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion ...

Added: February 14, 2023

The Writer Robin Dranattagor: Testing the Whisper Model on Russian Spoken Speech

Kolpashchikova E. O., Sotsio- i psikholingvisticheskie issledovaniya (Socio- and Psycholinguistic Studies), 2023.

Whisper is a speech recognition model released by OpenAI in 2022. It was trained on 680,000 hours of multilingual and multitask supervised data, which should improve its performance in recognizing accents and make it less sensitive to background noise. The study tests the capabilities of Whisper on field audio recordings from ...

Added: December 10, 2023

Voice command recognition in intelligent systems using deep neural networks

Sokolov A., Savchenko A., in: 17th World Symposium on Applied Machine Intelligence and Informatics (SAMI). IEEE, 2019. Ch. 19. P. 113–116.

In this article, we focus on the isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose to create a grammar model for a small testing command set with self-loops for each state to return blank symbols for noise and out-of-vocabulary words. In addition, we use single arc connected beginning and ending ...

Added: October 21, 2019

Artie Bias Corpus: An Open Dataset for Detecting Demographic Bias in Speech Applications

Meyer J., Rauchenstein L., Eisenberg J., in: Proceedings of The 12th Language Resources and Evaluation Conference. Vol. 12. European Language Resources Association (ELRA), 2020. P. 6462–6468.

We describe the creation of the Artie Bias Corpus, an English dataset of expert-validated <audio, transcript> pairs with demographic tags for age, gender, accent. We also release open software which may be used with the Artie Bias Corpus to detect demographic bias in Automatic Speech Recognition systems, and can be extended to other speech technologies. ...

Added: April 20, 2021

On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Sokolov A. / Series Computer Science "arxiv.org". 2021.

Added: November 17, 2020

Intercultural Space: Linguistic and Didactic Aspects. Part 2. Proceedings of the sections "Intercultural Linguistics", "Intercultural Translation Studies" and the Student Research Forum. Plenary session and the section "Intercultural Didactics".

Scherbakova A., PetrSU Publishing House, 2021.

The paper focuses on the task of clustering essays produced by ESL (English as a Second Language) learners. The data was taken from the REALEC learner corpus. The division of texts by certain characteristics can be useful to speed up the analysis of a single corpus or access to the necessary sections of a large ...

Added: April 30, 2021

Uncertainty Estimation in Autoregressive Structured Prediction

Andrey Malinin, Gales M., in: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). ICLR, 2021. P. 1–31.

Added: November 1, 2021

Speech Recognition in a Corpus of Audio Recordings of Sales Representatives: Problems, Solutions and Research Prospects

Kolmogorova P. A., in: Theoretical Semantics and Ideographic Lexicography: Dictionary. Discourse. Corpus: a collective monograph. [s.n.], 2023.

The purpose of this research is to examine existing methods of automatic speech recognition for the Russian language and their implementation for marketing communication (speech of sales representatives of a distribution company). The object of study is 1500 recordings of dialogues between sales representatives of a distribution company and their clients (approximately 12 hours and ...

Added: December 10, 2023

Data Clustering, Keyword Extraction and Lexical Diversity in Essay Texts of a Learner Corpus

Scherbakova A., in: Intercultural Space: Linguistic and Didactic Aspects. Proceedings of the sections "Intercultural Linguistics", "Intercultural Translation Studies" and the Student Research Forum. Plenary session and the section "Intercultural Didactics". Part 2. PetrSU Publishing House, 2021.

Added: September 30, 2021

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets

Ryabinin M., Malinin A., Gales M., in: Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Curran Associates, Inc., 2021. P. 6023–6035.

Added: October 31, 2021

Fuzzy Phonetic Encoding of Speech Signals in Voice Processing Systems

Savchenko L.V., Savchenko A.V., Journal of Communications Technology and Electronics. 2019. Vol. 64. No. 3. P. 238–244.

In this paper, we studied the phonetic approach for voice processing. A method for automatic recognition of speech signals, in which each quasistationary segment is associated with a fuzzy set of phonemes, was developed. We proposed the operation of the probabilistic triangular norm for fuzzy sets corresponding to the input frame and the nearest reference phoneme. The developed ...

Added: June 7, 2019

Gender domain adaptation for automatic speech recognition

Sokolov A., Savchenko A., in: 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI). IEEE, 2021. P. 413–418.

This paper focuses on fine-tuning acoustic models for speaker adaptation to a given gender. We pretrained the Transformer baseline model on Librispeech-960 and conducted experiments with fine-tuning on the gender-specific test subsets. The obtained word error rate (WER) relative to the baseline is up to 5% and 3% lower on male ...

Added: September 26, 2021