Speaker-Aware Training of Speech Emotion Classifier with Speaker Recognition
Ch. 55. P. 614–625.
In book
Vol. 12997. St. Petersburg: Springer, 2021.
Deeb B., Savchenko A. V., Makarov I., IEEE Access 2026 Vol. 13 P. 56283–56295
Speech emotion recognition has gained considerable attention in speech processing and machine learning due to its potential applications in human-computer interaction, mental health monitoring, and customer service. However, state-of-the-art models for speech emotion recognition use many parameters, which leads to high computational complexity. In this paper, we introduce a novel deep-learning model to enhance the accuracy ...
Added: June 16, 2026
Verkholyak O., Dvoynikova A., Karpov A., Journal of Internet Services and Information Security 2021 No. 1 P. 80–96
This paper presents a novel bimodal speech emotion recognition system based on analysis of acoustic and linguistic information. We propose a novel decision-level fusion strategy that leverages both emotions and sentiments extracted from audio and text transcriptions of extemporaneous speech utterances. We perform an experimental study to prove the effectiveness of the proposed methods using emotional ...
Added: April 24, 2026
Deeb B., Savchenko A., Makarov I., in: ECAI 2024. 27th European Conference on Artificial Intelligence, 19–24 October 2024, Santiago de Compostela, Spain – Including 13th Conference on Prestigious Applications of Intelligent Systems (PAIS 2024). IOS Press, 2024. P. 4479–4482.
In this paper, we introduce a novel tool for speech emotion recognition, CA-SER, that uses self-supervised learning to extract semantic speech representations from a pre-trained wav2vec 2.0 model and combines them with spectral audio features to improve speech emotion recognition. Our approach involves a self-attention encoder on MFCC features to capture meaningful patterns in audio ...
Added: February 15, 2025
Churaev E., Savchenko A., Компьютерная оптика (Computer Optics) 2023 Vol. 47 No. 5 P. 806–815
In this paper, an approach that can significantly increase the accuracy of facial emotion recognition by adapting the model to the emotions of a particular user (e.g., smartphone owner) is considered. At the first stage, a neural network model, which was previously trained to recognize facial expressions in static photos, is used to ...
Added: May 18, 2023
Sokolov A., / Series Computer Science "arxiv.org". 2021.
Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER). Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion ...
Added: November 17, 2020