On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era


2021.

Sokolov A.

Text encodings from automatic speech recognition (ASR) transcripts and audio representations have long shown promise in speech emotion recognition (SER). Yet it remains challenging to explain the effect of each information stream on SER systems. Moreover, the impact of ASR word error rate (WER) on linguistic emotion recognition, both on its own and when fused with acoustic information, needs clarification in the age of deep ASR systems. To address these issues, we create transcripts from the original speech by applying three modern ASR systems: an end-to-end model trained with recurrent neural network-transducer loss, a model with connectionist temporal classification loss, and a wav2vec framework for self-supervised learning. We then use pre-trained textual models to extract text representations from the ASR outputs and the gold standard. For extraction and learning of acoustic speech features, we utilise openSMILE, openXBoW, DeepSpectrum, and auDeep. Finally, we conduct decision-level fusion on both information streams, acoustics and linguistics. Using the best development configuration, we achieve state-of-the-art unweighted average recall values of 73.6% and 73.8% on the speaker-independent development and test partitions of IEMOCAP, respectively.
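The three quantities the abstract relies on (WER, unweighted average recall, and decision-level fusion) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the label strings, and especially the fusion rule (a weighted average of per-class probabilities with a hypothetical `weight` parameter) are assumptions made for illustration, since the abstract does not specify how the two streams are combined.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def unweighted_average_recall(y_true, y_pred) -> float:
    """Mean of per-class recalls, so every class counts equally regardless of size."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return sum(correct[c] / total[c] for c in total) / len(total)


def late_fusion(acoustic_probs: dict, linguistic_probs: dict, weight: float = 0.5) -> str:
    """Assumed fusion rule: weighted average of class probabilities, then argmax."""
    fused = {
        label: weight * acoustic_probs[label] + (1.0 - weight) * linguistic_probs[label]
        for label in acoustic_probs
    }
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    # Per-class recalls are 0.5, 1.0 and 1.0.
    print(unweighted_average_recall(["ang", "ang", "hap", "neu"],
                                    ["ang", "hap", "hap", "neu"]))
    # The linguistic stream outvotes the acoustic one at equal weights.
    print(late_fusion({"ang": 0.6, "hap": 0.4}, {"ang": 0.2, "hap": 0.8}))
```

Unweighted average recall is the natural metric here because emotion classes are typically imbalanced; plain accuracy would reward a system that favours the majority class.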

Research target: Computer Science

Priority areas: IT and mathematics

Language: English


On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

Shuranov E., / Series Computer Science "arxiv.org". 2021.

Added: February 14, 2023

Application of machine learning algorithms to information security problems

Nazarov A., Виноградов Ю. В., Сычев А. К., Системы высокой доступности 2018 Vol. 14 No. 4 P. 20–22

The article studies the use of machine learning algorithms in solving information security problems, namely, in the construction of next-generation intrusion detection systems (IDS). The main drawbacks of traditional IDS (based on signature rules) are considered and methods for their solution are proposed using the algorithms of machine learning. The article presents new methods of ...

Added: February 26, 2019

Advances in Computational Intelligence. IWANN 2019

Berlin: Springer, 2019.

This two-volume set LNCS 10305 and LNCS 10306 constitutes the refereed proceedings of the 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, held at Gran Canaria, Spain, in June 2019. The 150 revised full papers presented in this two-volume set were carefully reviewed and selected from 210 submissions. The papers are organized in topical sections ...

Added: July 29, 2019

Fuzzy Phonetic Encoding of Speech Signals in Voice Processing Systems

Savchenko L.V., Savchenko A.V., Journal of Communications Technology and Electronics 2019 Vol. 64 No. 3 P. 238–244

In this paper, we studied the phonetic approach for voice processing. A method for automatic recognition of speech signals, in which each quasistationary segment is associated with a fuzzy set of phonemes, was developed. We proposed the operation of the probabilistic triangular norm for fuzzy sets corresponding to the input frame and the nearest reference phoneme. The developed ...

Added: June 7, 2019

Mechanistic Permutability: Match Features Across Layers

Balagansky N., Ian Maksimov, Daniil Gavrilov, / Series Computer Science "arxiv.org". 2024.

Understanding how features evolve across layers in deep neural networks is a fundamental challenge in mechanistic interpretability, particularly due to polysemanticity and feature superposition. While Sparse Autoencoders (SAEs) have been used to extract interpretable features from individual layers, aligning these features across layers has remained an open problem. In this paper, we introduce SAE Match, ...

Added: February 20, 2025

Proceedings of the 6th International Conference on Learning Representations (ICLR 2018)

[s.n.], 2018.

Proceedings of the 6th International Conference on Learning Representations (ICLR 2018) ...

Added: October 29, 2018

Application of Deep Neural Networks to the Classification of Large Volumes of Astronomical Data

Gorbunov A. A., Isaev E., Samodurov V., Radio Physics and Radio Astronomy 2017 Vol. 22 No. 4 P. 270–275

Astronomical observations generate vast amounts of data. The BSA (Big Scanning Antenna) of LPI, used to study impulse phenomena, logs 87.5 GB of data daily (32 TB per year). Experts classified 83096 individual observations (over the study period July 2012 - October 2013). Over 75% of the sample ...

Added: October 15, 2017

Probabilistic Neural Network With Complex Exponential Activation Functions in Image Recognition

Savchenko A., IEEE Transactions on Neural Networks and Learning Systems 2020 Vol. 31 No. 2 P. 651–660

If the training data set in image recognition task is not very large, the feature extraction with a convolutional neural network is usually applied. Here, we focus on the nonparametric classification of extracted feature vectors using the probabilistic neural network (PNN). The latter is characterized by the high runtime and memory space complexity. We propose ...

Added: November 1, 2019

Modelling networks-on-chip based on regular and quasi-optimal topologies using the OCNS simulator

Romanov A., Tumkovskiy S., Иванова Г. А., Вестник РГРТУ 2015 Vol. 2 No. 52 P. 61–66

A review of networks-on-chip modeling methods is given. A high-level model of networks-on-chip is developed in the Java programming language; it accelerates the modeling process by several orders of magnitude compared to HDL models. The results of simulation of networks-on-chip based on regular and quasi-optimal topologies with the number of nodes up to 100 ...

Added: June 21, 2015

Agent-based modelling of interactions between air pollutants and greenery using a case study of Yerevan, Armenia

Akopov A. S., Beklaryan L. A., Saghatelyan A. K., Environmental Modelling and Software 2019 Vol. 116 P. 7–25

Urban greenery such as trees can effectively reduce air pollution in a natural and eco-friendly way. However, how to spatially locate and arrange greenery in an optimal way remains a challenging task. We developed an agent-based model of air pollution dynamics to support the optimal allocation and configuration of tree clusters in a city. The Pareto ...

Added: February 24, 2019

Using webcams as a source of stereo pairs

Protasov S., Кургалин С. Д., Крыловецкий А. А., Вестник Воронежского государственного университета. Серия: Системный анализ и информационные технологии 2011 No. 2 P. 80–86

Forming a stereo video stream is currently required in a wide range of practical applications. Besides the film industry, real-time acquisition and processing of stereo images is used in industry, communications, modelling, and other fields. This article considers an approach to building a flexible stereo video capture system based on webcams that can be integrated into compact personal devices. ...

Added: February 12, 2013

Intelligent Network Security Monitoring Based on Optimum-Path Forest Clustering

Guimarães R. R., Passos L., Filho R. H. et al., IEEE Network 2019 Vol. 33 No. 2 P. 126–131

Distinguishing outliers from normal data in wireless sensor networks has been a big challenge in the anomaly detection domain, mostly due to the nature of the anomalies, such as software or hardware failures, reading errors or malicious attacks, just to name a few. In this article, we introduce an anomaly detection-based OPF classifier in the ...

Added: December 19, 2018

Probably approximately correct learning of Horn envelopes from queries

Borchmann D., Hanika T., Obiedkov S., Discrete Applied Mathematics 2020 Vol. 273 P. 30–42

We propose an algorithm for learning the Horn envelope of an arbitrary domain using an expert, or an oracle, capable of answering certain types of queries about this domain. Attribute exploration from formal concept analysis is a procedure that solves this problem, but the number of queries it may ask is exponential in the size ...

Added: October 29, 2019

Proceedings of 11th Industrial Conference on Data Mining (ICDM 2012)

Springer, 2012.

Added: January 29, 2013

Microelectronics and Informatics – 2013. Abstracts of Reports

Zelenograd: MIET, 2013.

This collection of abstracts from the 20th All-Russian Interuniversity Scientific and Technical Conference "Microelectronics and Informatics 2013", held in the year of the 55th anniversary of the founding of Zelenograd, a centre of microelectronics and nanotechnology recognized nationally and worldwide, presents research results by students, postgraduates, and young scientists from Zelenograd enterprises and Russian universities in the following priority areas of science and technology: micro- and nanoelectronics, ...

Added: May 31, 2013

CEUR Workshop Proceedings. Proceedings of the International Workshop on Social Network Analysis using Formal Concept Analysis (SNAFCA 2015)

Malaga: CEUR Workshop Proceedings, 2015.

Social network analysis (SNA) is a multidisciplinary research area that has attracted many researchers from different disciplines such as Physics, Mathematics, Sociology, Biology and Computer Science, and has been studied according to different approaches and techniques. A social network is a dynamic structure (generally represented as a graph) of a set of entities/actors (nodes) together ...

Added: October 19, 2015

Proceedings of the NI Academic Days 2017 conference, Moscow, April 13-14, 2017

Moscow: National Instruments Russia, 2017.

The collection consists of previously unpublished papers presenting the results of original research and technical solutions. We hope that it will prove useful to specialists working in various fields of science and technology, to a wide range of university teachers, postgraduates, and students, and to teachers at secondary schools and technical colleges. ...

Added: May 10, 2017

Computer synthesis and modelling of nanostructures of bistable cells for memory arrays with increased information density.

Trubochkina N. K., Качество. Инновации. Образование 2014 No. 9 P. 43–53

An approach to creating a memory array built on two different algorithms for providing the basic memory element, an R-trigger in transition circuitry, is described. The results of successful computer simulations of two single-layer nanostructures for memory arrays with high information density are given. Of fundamental importance is the implementation of single-layer nanostructure storage elements, which ...

Added: March 2, 2015

Programming in the UNIX operating environment: information exchange between parallel processes, file protection in the file system, interrupt handling

Istratov A., Moscow: РГУИТП, 2006.

Aspects of system programming in UNIX-like operating systems are considered ...

Added: February 8, 2013

The complexity of the 3-colorability problem in the absence of a pair of small forbidden induced subgraphs

Malyshev D., Discrete Mathematics 2015 Vol. 338 No. 11 P. 1860–1865

We completely determine the complexity status of the 3-colorability problem for hereditary graph classes defined by two forbidden induced subgraphs with at most five vertices. ...

Added: April 7, 2014

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, May 29 - June 1, 2019). Issue 18 (25)

Moscow: Russian State University for the Humanities Publishing Center, 2019.

The collection includes 27 papers from the international conference on computational linguistics and intellectual technologies "Dialogue 2019" that were not included in the yearbook "Computational Linguistics and Intellectual Technologies" but were recommended by the Program Committee for presentation at the conference. For specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: December 10, 2019

Formation of Control Structures in Static Swarms

Karpov V. E., Karpova I. P., Procedia Engineering 2015 Vol. 100 P. 1459–1468

Working solutions are proposed for the problems of leader definition and role distribution in homogeneous groups of robots. It is shown that the transition from a swarm to a collective of robots with hierarchical organization is possible using exclusively local interaction. The local revoting algorithm is central to the procedure for choosing a leader, while redistribution of roles can ...

Added: March 14, 2015

Improving quality of graph partitioning using multi-level optimization

S. D. Kuznetsov, Turdakov D. Y., Пастухов Р. К. et al., Programming and Computer Software 2015 Vol. 41 No. 5 P. 302–306

Graph partitioning is required for solving tasks on graphs that need to be distributed over disks or computers. This problem is well studied, but the majority of the results on this subject are not suitable for processing graphs with billions of nodes on commodity clusters, since they require shared memory or low-latency messaging. One of ...

Added: January 23, 2018

Priority Queueing for Packets with Two Characteristics

Chuprikov P., Nikolenko S. I., Davydow A. et al., IEEE Transactions on Networking 2018 Vol. 26 No. 1 P. 342–355

Modern network elements are increasingly required to deal with heterogeneous traffic. Recent works consider processing policies for buffers that hold packets with different processing requirements (number of processing cycles needed before a packet can be transmitted out) but uniform value, aiming to maximize the throughput, i.e., the number of transmitted packets. Other developments deal with ...

Added: March 14, 2018