Structured Sparsification of Gated Recurrent Neural Networks
One of the most popular approaches to neural network compression is sparsification, i.e. learning sparse weight matrices. In structured sparsification, weights are zeroed out in groups corresponding to structural units, e.g. neurons. We extend the structured sparsification approach to gated recurrent neural networks, e.g. Long Short-Term Memory (LSTM). Specifically, in addition to sparsifying individual weights and neurons, we propose sparsifying the preactivations of the gates. This makes some gates constant and simplifies the LSTM structure. We test our approach on text classification and language modeling tasks. Our method improves neuron-wise compression of the model on most of the tasks. We also observe that the resulting structure of gate sparsity depends on the task, and we connect the learned structures to the specifics of the particular tasks.
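Group sparsity of the kind described in this abstract is commonly induced with a group-lasso penalty, whose proximal step zeroes entire weight groups at once. A minimal numpy sketch, assuming each row of a weight matrix forms one group (e.g. all weights feeding one gate preactivation); the sizes and threshold are hypothetical, and this is not the authors' implementation:

```python
import numpy as np

def group_prox(W, lam):
    """Proximal step of the group-lasso penalty: each row of W is a
    group; rows whose L2 norm falls below lam are zeroed entirely,
    surviving rows are shrunk toward zero."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return W * scale

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 16))  # 8 groups of 16 weights
W[2] *= 20  # one group with large weights should survive the penalty
W_sparse = group_prox(W, lam=1.0)
n_zero = int(np.sum(np.all(W_sparse == 0, axis=1)))  # fully-zeroed groups
```

A group that is zeroed this way corresponds to a gate preactivation that becomes constant (equal to its bias), which is what simplifies the LSTM structure.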
Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too high for real-time deployment in offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by very large vocabulary sizes. We focus on compression methods that are effective in the context of on-device deployment: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. An experimental study on the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and the compression–perplexity balance are obtained with matrix decomposition techniques.
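The low-rank factorization mentioned in this abstract replaces a dense weight matrix by a product of two thin matrices obtained from a truncated SVD. A minimal numpy sketch with hypothetical sizes (the actual models and ranks in the paper may differ):

```python
import numpy as np

def low_rank(W, r):
    """Truncated SVD: W (n x n) is approximated by A @ B with
    A: n x r and B: r x n, storing 2*n*r numbers instead of n*n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r] * s[:r], Vt[:r, :]  # fold singular values into A

rng = np.random.default_rng(1)
W = rng.normal(size=(400, 400))  # stand-in for an RNN weight matrix
A, B = low_rank(W, r=40)
compression = W.size / (A.size + B.size)  # parameter-count ratio
```

At inference time the matrix-vector product `W @ x` is replaced by `A @ (B @ x)`, which also reduces the multiply count by the same ratio.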
Movement control of artificial limbs has made big advances in recent years. New sensor and control technology has enhanced the functionality and usefulness of artificial limbs to the point that complex movements, such as grasping, can be performed to a limited extent. To date, the most successful results have been achieved by applying recurrent neural networks (RNNs). However, in the domain of artificial hands, experiments so far have been limited to non-mobile wrists, which significantly reduces the functionality of such prostheses. In this paper, for the first time, we present empirical results on gesture recognition with both mobile and non-mobile wrists. Furthermore, we demonstrate that recurrent neural networks with simple recurrent units (SRUs) outperform regular RNNs in both cases in terms of gesture recognition accuracy, on data acquired by an armband sensing electrical signals from the arm muscles (via surface electromyography, or sEMG). Finally, we show that adding domain adaptation techniques to continuous gesture recognition with RNNs improves the transfer ability between subjects, where a limb controller trained on data from one person is used by another person.
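The appeal of the SRU over a regular RNN is that its cell-state recurrence is elementwise, so the expensive matrix products depend only on the input. A simplified single-step sketch in numpy, assuming equal input and hidden sizes (the paper's exact SRU variant and parameterization may differ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru_step(x, c_prev, Wf, bf, W, Wr, br):
    """One step of a simplified Simple Recurrent Unit: gates depend
    only on the input x, so the recurrence over c is elementwise."""
    f = sigmoid(Wf @ x + bf)             # forget gate
    c = f * c_prev + (1 - f) * (W @ x)   # light-weight cell update
    r = sigmoid(Wr @ x + br)             # highway/reset gate
    h = r * np.tanh(c) + (1 - r) * x     # highway connection to input
    return h, c

# Sanity check with all-zero parameters: gates sit at 0.5,
# the cell state stays at zero, and h passes through 0.5 * x.
d = 4
x, c0 = np.ones(d), np.zeros(d)
Z, z = np.zeros((d, d)), np.zeros(d)
h, c = sru_step(x, c0, Z, z, Z, Z, z)
```

Because `c` is updated elementwise, the steps of a sequence can be parallelized far more aggressively than in a standard RNN, which is the usual motivation for choosing SRUs.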
In this article, we focus on isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose to create a grammar model for a small test command set, with self-loops at each state that return blank symbols for noise and out-of-vocabulary words. In addition, we use a single arc connecting the beginning and the end of the grammar in order to filter out unknown commands. As a result, the grammar is resistant to distortions and unexpected words near or inside a command. We implemented the proposed approach using finite state transducers in the Kaldi framework and evaluated it on self-recorded noisy data with various signal-to-noise ratios. We compared the recognition accuracy and average decision-making time of our approach with state-of-the-art continuous speech recognition engines based on language models. It was experimentally shown that our approach achieves up to 60% higher accuracy than conventional offline speech recognition methods based on language models, and its utterance recognition speed is 3 times higher than that of traditional continuous speech recognition algorithms.
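The behavior of such a grammar can be illustrated without Kaldi by a tiny hand-rolled acceptor: one state per command word, with a self-loop on every state that absorbs noise and out-of-vocabulary tokens as blanks. A toy Python sketch (the command, vocabulary, and `<noise>` token are hypothetical, and this does not reproduce the paper's FST construction):

```python
def make_grammar(command, vocab):
    """Acceptor for one command: self-loops absorb noise/OOV tokens,
    in-vocabulary words must match the command sequence in order."""
    def accept(tokens):
        state = 0  # index of the next expected command word
        for t in tokens:
            if t == "<noise>" or t not in vocab:
                continue              # self-loop: treat as blank
            if state < len(command) and t == command[state]:
                state += 1            # advance along the command path
            else:
                return False          # unexpected in-vocabulary word
        return state == len(command)  # reached the final state?
    return accept

accept = make_grammar(["turn", "left"], vocab={"turn", "left", "right"})
ok = accept(["<noise>", "turn", "blah", "left"])    # noise/OOV absorbed
bad = accept(["turn", "right"])                     # wrong command word
```

In the real system the same idea is expressed as a weighted FST, which additionally lets the decoder score competing paths rather than make hard accept/reject decisions per token.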
In this paper, we address the excessive run-time and memory complexity of contemporary deep convolutional neural networks in the problem of image recognition. A survey of recent compression methods and efficient neural network architectures is provided. The experimental study is focused on the visual emotion recognition problem. We compare the computational speed and memory consumption during the training and inference stages of such methods as weight matrix decomposition, binarization, and hashing on the visual emotion recognition problem. It is experimentally shown that the most efficient recognition is achieved with full network binarization and matrix decomposition.
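Weight binarization of the kind compared in this abstract is usually done XNOR-Net style: each real-valued matrix is replaced by a scaled sign matrix, so weights fit in one bit plus one shared scale. A minimal numpy sketch (the toy matrix is illustrative, not from the paper):

```python
import numpy as np

def binarize(W):
    """XNOR-Net-style binarization: W is approximated by
    alpha * sign(W), where alpha = mean(|W|) minimizes the L2 error
    over all matrices with entries in {-alpha, +alpha}."""
    alpha = np.mean(np.abs(W))
    return alpha * np.sign(W), alpha

W = np.array([[0.5, -1.5],
              [2.0, -0.1]])
Wb, alpha = binarize(W)  # alpha = (0.5 + 1.5 + 2.0 + 0.1) / 4 = 1.025
```

The memory saving comes from storing only the sign bits and `alpha`; at inference, dot products with binarized weights reduce to cheap sign manipulations plus one final scaling.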
Vast amounts of data are collected in the process of astronomical observations. The BSA (Big Scanning Antenna) of LPI, used in the study of impulse phenomena, logs 87.5 GB of data daily (32 TB per year). Experts have classified 83,096 individual observations (over the study period July 2012 – October 2013). Over 75% of the sample corresponds to pulsars, scintillating sources, and fast radio transients; all other classes of observations correspond to hardware failures, interference, and passes of Earth satellites and aircraft. In total, 15 classes of observations were identified.
Such a sample, divided into classes, makes it possible to apply machine learning algorithms. It has become possible to develop an automated service for short-term/long-term monitoring of various classes of radio sources (including radio transients of different nature), monitoring of the Earth's ionosphere and of the interplanetary and interstellar plasma, and the search for and monitoring of different classes of radio sources. Monitoring in this case refers to the automatic filtering and detection of previously unclassified impulse phenomena.
Currently, statistical analysis methods are used for automatic filtering. This report examines an alternative approach based on a neural network machine learning algorithm: the network takes raw data as input and, after processing by the hidden layer, determines the class of an impulse phenomenon at the output layer.
The neural network model, trained on the sample and classifying previously unclassified impulse phenomena, is created using the Microsoft Azure Machine Learning Studio cloud service. A web service built on top of the model allows classifying single impulse phenomena in real time (Request/Reply) as well as data samples for a given period (Batch processing).
The article studies the use of machine learning algorithms for solving information security problems, namely, for building next-generation intrusion detection systems (IDS). The main drawbacks of traditional IDS (based on signature rules) are considered, and solutions based on machine learning algorithms are proposed. The article presents new methods of applying machine learning algorithms that make it possible to detect both already known threats and previously unseen variations of known threats.
Proceedings of the 6th International Conference on Learning Representations (ICLR 2018)
This two-volume set, LNCS 11506 and LNCS 11507, constitutes the refereed proceedings of the 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, held in Gran Canaria, Spain, in June 2019. The 150 revised full papers presented in this two-volume set were carefully reviewed and selected from 210 submissions. The papers are organized in topical sections on machine learning in weather observation and forecasting; computational intelligence methods for time series; human activity recognition; new and future tendencies in brain-computer interface systems; random-weights neural networks; pattern recognition; deep learning and natural language processing; software testing and intelligent systems; data-driven intelligent transportation systems; deep learning models in healthcare and biomedicine; deep learning beyond convolution; artificial neural networks for biomedical image processing; machine learning in vision and robotics; system identification, process control, and manufacturing; image and signal processing; soft computing; mathematics for neural networks; internet modeling, communication and networking; expert systems; evolutionary and genetic algorithms; advances in computational intelligence; and computational biology and bioinformatics.