Real-time Streaming Wave-U-Net with Temporal Convolutions for Multichannel Speech Enhancement

Cornell University, 2021.

Sokolov A.

In this paper we describe our work for Task 1 of the ConferencingSpeech 2021 challenge, which set the goal of developing a real-time solution for multi-channel speech enhancement. We propose a novel system for streaming speech enhancement. We employ the Wave-U-Net architecture with temporal convolutions in the encoder and decoder. We incorporate self-attention in the decoder to apply an attention mask, retrieved from the skip connection, to features from the down-blocks. We explore history cache mechanisms that work like hidden states in recurrent networks and implement them in the proposed solution. This lets us run inference on chunks of 40 ms with a Real-Time Factor of 0.4 at the same precision.
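The history cache idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a single causal FIR filter stands in for a temporal convolution layer, and the names `causal_conv`, `StreamingCausalConv`, and `real_time_factor` are hypothetical. The sketch assumes that the cache stores the last k-1 input samples of a layer with kernel size k, so that chunked inference reproduces the offline output exactly.

```python
from typing import List, Sequence


def causal_conv(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Offline reference: y[t] = sum_j kernel[j] * x[t - j], with zeros before t = 0."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(signal)
    return [
        sum(kernel[j] * padded[t + k - 1 - j] for j in range(k))
        for t in range(len(signal))
    ]


class StreamingCausalConv:
    """The same convolution, fed chunk by chunk (illustrative stand-in for one layer)."""

    def __init__(self, kernel: Sequence[float]) -> None:
        self.kernel = list(kernel)
        # History cache: the last k-1 input samples, carried between chunks
        # much like a hidden state in a recurrent network. Starts as zeros.
        self.cache: List[float] = [0.0] * (len(self.kernel) - 1)

    def process(self, chunk: Sequence[float]) -> List[float]:
        k = len(self.kernel)
        buf = self.cache + list(chunk)  # prepend the cached history
        out = [
            sum(self.kernel[j] * buf[t + k - 1 - j] for j in range(k))
            for t in range(len(chunk))
        ]
        self.cache = buf[len(buf) - (k - 1):]  # keep the newest k-1 samples
        return out


def real_time_factor(processing_ms: float, chunk_ms: float) -> float:
    """Assumed definition: processing time divided by the duration of audio processed."""
    return processing_ms / chunk_ms


signal = [float(v) for v in range(1, 13)]
kernel = [0.5, 0.25, 0.25]

layer = StreamingCausalConv(kernel)
streamed: List[float] = []
for start in range(0, len(signal), 4):  # feed the signal in chunks of 4 samples
    streamed.extend(layer.process(signal[start:start + 4]))

offline = causal_conv(signal, kernel)
print(streamed == offline)
```

Under the assumed definition of Real-Time Factor, an RTF of 0.4 on 40 ms chunks would correspond to roughly 16 ms of computation per chunk; that figure is derived arithmetic, not a number reported in the abstract.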

Research target: Computer Science

Priority areas: IT and mathematics

Language: English

Fuzzy Analysis and Deep Convolution Neural Networks in Still-to-video Recognition

Savchenko A., Belova N. S., Savchenko Lyudmila V., Optical Memory and Neural Networks (Information Optics) 2018 Vol. 27 No. 1 P. 23-31

We discuss the video classification problem with the matching of feature vectors extracted using deep convolutional neural networks from each frame. We propose the novel recognition method based on representation of each frame as a sequence of fuzzy sets of reference classes whose degrees of membership are defined based on asymptotic distribution of the Kullback–Leibler ...

Added: February 9, 2018

Foundations of Intelligent Systems. 25th International Symposium on Methodologies for Intelligent Systems: ISMIS 2020

Springer, 2020

This book constitutes the proceedings of the 25th International Symposium on Foundations of Intelligent Systems, ISMIS 2020, held in Graz, Austria, in October 2020. The conference was held virtually due to the COVID-19 pandemic. The 35 full and 8 short papers presented in this volume were carefully reviewed and selected from 79 submissions. Included is also ...

Added: October 4, 2020

Intelligent Systems and Applications

Cham : Springer, 2019

Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. This conference is a part of SAI conferences being held since 2013. The conference series has featured keynote talks, special sessions, poster presentation, tutorials, workshops, and contributed papers each year. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and ...

Added: August 29, 2018

Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers

Switzerland : Springer, 2019

This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...

Added: February 8, 2020

Learning velocity model for complex media with deep convolutional neural networks

Gremyachikh L., Ustyuzhanin A., Станкевич А. et al., / arxiv. Series 2110.08626 "Machine Learning". 2021.

The paper considers the problem of velocity model acquisition for a complex media based on boundary measurements. The acoustic model is used to describe the media. We used an open-source dataset of velocity distributions to compare the presented results with the previous works directly. Forward modeling is performed using the grid-characteristic numerical method. The inverse ...

Added: May 24, 2022

Generalized approach to sentiment analysis of short text messages in natural language processing

Polyakov E. V., Voskov L., Abramov P. et al., Informatsionno-upravliaiushchie sistemy [Information and Control Systems] 2020 No. 1 P. 2-14

Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to ...

Added: February 20, 2020

Deep learning based methods for estimating distribution of coalescence rates from genome-wide data

Khomutov E., Arzymatov K., Shchur V., Journal of Physics: Conference Series 2021 Vol. 1740 Article 012031

Demographic and population structure inference is one of the most important problems in genomics. Population parameters such as effective population sizes, population split times and migration rates are of high interest both themselves and for many applications, e.g. for genome-wide association studies. Hidden Markov Model (HMM) based methods, such as PSMC, MSMC, coalHMM etc., proved ...

Added: May 17, 2021

Pronunciation training system based on convolutional neural networks and the information theory of speech perception

Savchenko L., Информационные технологии [Information Technologies] 2019 Vol. 25 No. 5 P. 313-318

We consider a problem of computer assisted language and pronunciation learning based on the deep learning methods and the information theory of speech perception. In order to improve the efficiency of testing of pronunciation quality, we propose to train a convolutional neural network using the best reference utterances from the user. The experimental results proved ...

Added: May 29, 2019

Deep learning approach for predicting functional Z-DNA regions using omics data

Beknazarov N., Jin S., Poptsova M., Scientific Reports 2020 Vol. 10 P. 19134

Computational methods to predict Z-DNA regions are in high demand to understand the functional role of Z-DNA. The previous state-of-the-art method Z-Hunt is based on statistical mechanical and energy considerations about B- to Z-DNA transition using sequence information. Z-DNA ChIP-seq experiment results showed little overlap with Z-Hunt predictions implying that sequence information only is not ...

Added: December 11, 2020

Workshop on Compact Deep Neural Network Representation with Industrial Applications, Thirty-second Conference on Neural Information Processing Systems

Montréal : [s.n.], 2018

This workshop aims to bring together researchers, educators, practitioners who are interested in techniques as well as applications of making compact and efficient neural network representations. One main theme of the workshop discussion is to build up consensus in this rapidly developed field, and in particular, to establish close connection between researchers in Machine Learning ...

Added: December 5, 2018

Breaking Sticks and Ambiguities with Adaptive Skip-gram

Bartunov S., Vetrov D., Kondrashkin D. et al., Journal of Machine Learning Research 2016 Vol. 51 P. 130-138

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, as well as most prior work on learning word representations, does not take into account word ambiguity and maintains only a single representation per word. Although a number of Skip-gram modifications were proposed ...

Added: October 1, 2016

Detecting ethnicity-targeted hate speech in Russian social media texts

Pronoza E., Panicheva P., Koltsova O. et al., Information Processing and Management 2021 Vol. 58 No. 6 Article 102674

Ethnicity-targeted hate speech has been widely shown to influence on-the-ground inter-ethnic conflict and violence, especially in such multi-ethnic societies as Russia. Therefore, ethnicity-targeted hate speech detection in user texts is becoming an important task. However, it faces a number of unresolved problems: difficulties of reliable mark-up, informal and indirect ways of expressing negativity in user ...

Added: September 2, 2021

Learning from Metabolic Networks: Current Trends and Future Directions for Precision Medicine

Granata I., Manzo M., Kusumastuti A. et al., Current Medicinal Chemistry 2021 Vol. 28 No. 32 P. 6619-6653

Purpose: Systems biology and network modeling represent, nowadays, the hallmark approaches for the development of predictive and targeted-treatment based precision medicine. The study of health and disease as properties of the human body system allows the understanding of the genotype-phenotype relationship through the definition of molecular interactions and dependencies. In this scenario, metabolism plays a ...

Added: August 25, 2021

Spatially Adaptive Computation Time for Residual Networks

Figurnov M., Collins M., Zhu Y. et al., / Cornell University. Series arXiv "arXiv:1612.02297". 2016.

This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and ...

Added: December 12, 2016

Black-Box Optimization with Local Generative Surrogates

Belavin V., Ustyuzhanin A., Широбоков С. К. et al., Proceedings of Machine Learning Research 2020 P. 1-9

We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use ...

Added: October 31, 2019

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Polykovskiy D., Zhebrak A., Sanchez-Lengeling B. et al., Frontiers in Pharmacology 2020 Vol. 11 P. 1-10

Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is ...

Added: April 21, 2021

Bayesian Group Sparsification of Long Short-Term Memory Networks

Lobacheva E., Chirkova N., Vetrov D., 2018.

We propose a new Bayesian sparsification technique for gated recurrent architectures that accounts for its recurrent specifics and gated mechanism. Our method eliminates neurons from the model and makes gates constant, not only compressing the network, but also significantly accelerating a forward pass. On the discriminative tasks our method compresses LSTM extremely, so that only ...

Added: October 16, 2018

mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing

Soshnikov D. V., Valieva Y., Microsoft Journal of Applied Research, USA 2019 Vol. 12 P. 140-150

In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more ’fields’ in the process of data ...

Added: November 20, 2020

Deep convolutional neural networks capabilities for binary classification of polar mesocyclones in satellite mosaics

Криницкий М. А., Verezemskaya P., Гращенков К. В. et al., Atmosphere 2018 Vol. 9 No. 426 P. 1-23

Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, are accompanied by extremely strong surface winds and heat fluxes and thus largely influencing deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data, while challenging, is a time-consuming task, when performed ...

Added: November 26, 2020

Deep neural networks and maximum likelihood search for approximate nearest neighbor in video-based image recognition

Savchenko A., Optical Memory and Neural Networks (Information Optics) 2017 Vol. 26 No. 2 P. 129-136

We analyzed the way to increase computational efficiency of video-based image recognition methods with matching of high dimensional feature vectors extracted by deep convolutional neural networks. We proposed an algorithm for approximate nearest neighbor search. At the first step, for a given video frame the algorithm verifies a reference image obtained when recognizing the previous ...

Added: June 30, 2017

Algorithmic Minimal Sufficient Statistics: a New Approach

Vereshchagin N., Theory of Computing Systems 2016 Vol. 58 No. 3 P. 463-481

We introduce the notion of a strong sufficient statistic for a given data string. We show that strong sufficient statistics have better properties than just sufficient statistics. We prove that there are “strange” data strings, whose minimal strong sufficient statistic has much larger complexity than the minimal sufficient statistic. ...

Added: February 7, 2017

Probabilistic adaptive computation time

Figurnov M., Sobolev A., Vetrov D., Bulletin of the Polish Academy of Sciences: Technical Sciences 2018 Vol. 66 No. 6 P. 811-820

We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference is performed ...

Added: February 27, 2019

mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing

Soshnikov D. V., Valieva Y., / Cornell University. Series Computer Science "arxiv.org". 2021.

In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more 'fields' in the process of data preparation ...

Added: October 7, 2021

Simulating the time projection chamber responses at the MPD detector using generative adversarial networks

A. Maevskiy, F. Ratnikov, Zinchenko A. et al., The European Physical Journal C - Particles and Fields 2021 Vol. 81 Article 599

High energy physics experiments rely heavily on the detailed detector simulation models in many tasks. Running these detailed models typically requires a notable amount of the computing time available to the experiments. In this work, we demonstrate a new approach to speed up the simulation of the Time Projection Chamber tracker of the MPD experiment at ...

Added: July 12, 2021