On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay
P. 21545-21556.
Language:
English
In book
Curran Associates, Inc., 2021
Nakhodnov M., Kodryan M., Lobacheva E. et al., , in : Doklady Mathematics. Vol. 106. Issue 1: Supplement.: Pleiades Publishing, Ltd., 2023. P. 43-62.
Knowledge of the loss landscape geometry makes it possible to successfully explain the behavior of neural networks, the dynamics of their training, and the relationship between resulting solutions and hyperparameters, such as the regularization method, neural network architecture, or learning rate schedule. In this paper, the dynamics of learning and the surface of the standard ...
Added: June 9, 2023
Kodryan M., Lobacheva E., Nakhodnov M. et al., , in : Thirty-Sixth Conference on Neural Information Processing Systems : NeurIPS 2022. : Curran Associates, Inc., 2022. P. 14058-14070.
A fundamental property of deep learning normalization techniques, such as batch normalization, is making the pre-normalization parameters scale invariant. The intrinsic domain of such parameters is the unit sphere, and therefore their gradient optimization dynamics can be represented via spherical optimization with varying effective learning rate (ELR), which was studied previously. However, the varying ELR ...
Added: December 20, 2022
A. G. Rassadin, A. V. Savchenko, , in : Proceedings of the III International Conference on Information Technologies and Nanotechnologies (ITNT). : Samara : Novaya Tekhnika, 2017. P. 649-654.
In this paper, we consider the excessive runtime and memory requirements of contemporary deep convolutional neural networks in image recognition. A survey of recent compression methods and efficient neural network architectures is provided. The experimental study is focused on the visual emotion recognition problem. We compare the computational speed and ...
Added: September 8, 2017
Garipov T., Izmailov P., Podoprikhin D. A. et al., , in : Advances in Neural Information Processing Systems 31 (NIPS 2018). : [s.n.], 2018. P. 1-10.
The loss functions of deep neural networks are complex and their geometric properties are not well understood. We show that the optima of these complex loss functions are in fact connected by simple curves over which training and test accuracy are nearly constant. We introduce a training procedure to discover these high-accuracy pathways between modes. ...
Added: February 27, 2019
Nazarov A., Vinogradov Yu. V., Sychev A. K., Highly Available Systems (Sistemy vysokoy dostupnosti) 2018 Vol. 14 No. 4 P. 20-22
The article studies the use of machine learning algorithms in solving information security problems, namely, in the construction of next-generation intrusion detection systems (IDS). The main drawbacks of traditional IDS (based on signature rules) are considered and methods for their solution are proposed using the algorithms of machine learning. The article presents new methods of ...
Added: February 26, 2019
Lobacheva E., Chirkova N., Markovich A. et al., , in : Thirty-Fourth AAAI Conference on Artificial Intelligence. Vol. 34.: AAAI Press, 2020. Ch. 5938. P. 4989-4996.
Added: October 29, 2020
Gorbunov A. A., Isaev E., Samodurov V., Radio Physics and Radio Astronomy 2017 Vol. 22 No. 4 P. 270-275
Astronomical observations produce vast amounts of data. The BSA (Big Scanning Antenna) of LPI, used in the study of impulse phenomena, logs 87.5 GB of data daily (32 TB per year). Experts classified 83096 individual observations (over the study period July 2012 - October 2013). Over 75% of the sample ...
Added: October 15, 2017
Sokolov A., Savchenko A., , in : 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI). : IEEE, 2021. P. 413-418.
This paper focuses on finetuning acoustic models for speaker adaptation to a given gender. We pretrained the Transformer baseline model on Librispeech-960 and conducted experiments with finetuning on the gender-specific test subsets. The obtained word error rate (WER) relative to the baseline is up to 5% and 3% lower on male ...
Added: September 26, 2021
Demochkina P., Savchenko A., , in : Proceedings of IEEE International Russian Automation Conference (RusAutoCon 2020). : IEEE, 2020. Ch. 110. P. 610-614.
In this paper, we address the problem of detecting small objects on high-quality X-ray images using deep neural networks. We propose a two-stage approach in which, first, the input image is split into partially overlapping blocks to make small objects more discriminative for detection. Second, the small blocks are fed into conventional single-shot detectors. These detectors ...
Added: October 3, 2020
[s.n.], 2018
Proceedings of the 6th International Conference on Learning Representations (ICLR 2018) ...
Added: October 29, 2018
Savchenko A., IEEE Transactions on Neural Networks and Learning Systems 2020 Vol. 31 No. 2 P. 651-660
If the training data set in an image recognition task is not very large, feature extraction with a convolutional neural network is usually applied. Here, we focus on the nonparametric classification of extracted feature vectors using the probabilistic neural network (PNN). The latter is characterized by high runtime and memory space complexity. We propose ...
Added: November 1, 2019
Lobacheva E., Chirkova N., Kodryan M. et al., , in : Advances in Neural Information Processing Systems 33 (NeurIPS 2020). : Curran Associates, Inc., 2020. P. 2375-2385.
Added: October 29, 2020
Kopeykina Lyudmila, Savchenko A., , in : 2019 International Russian Automation Conference (RusAutoCon). : IEEE, 2019. P. 1-6.
The authors consider the problem of automatic detection of private scanned documents based on text recognition with deep neural networks. The paper suggests a two-phase approach. The first stage includes efficient EAST text detection and recognition using the Tesseract OCR Engine. In the second, the authors classify the privacy of a scanned document by deep ...
Added: October 21, 2019
Berlin : Springer, 2019
This two-volume set LNCS 11506 and LNCS 11507 constitutes the refereed proceedings of the 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, held at Gran Canaria, Spain, in June 2019.
The 150 revised full papers presented in this two-volume set were carefully reviewed and selected from 210 submissions. The papers are organized in topical sections ...
Added: July 29, 2019
Ashukha A., Vetrov D., Molchanov D. et al., , in : Workshop of the 6th International Conference on Learning Representations (ICLR). : International Conference on Learning Representations, ICLR, 2018. P. 1-6.
In this work, we investigate the Batch Normalization technique and propose its probabilistic interpretation. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test. However, inference becomes computationally inefficient. To ...
Added: October 31, 2018
Sokolov A., / Cornell University. Series Computer Science "arxiv.org". 2021.
Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion ...
Added: November 17, 2020
Belavin V., Ustyuzhanin A., Sergey Shirobokov et al., , in : Advances in Neural Information Processing Systems 33 (NeurIPS 2020). : Curran Associates, Inc., 2020. P. 14650-14662.
Added: February 14, 2021
Belomestny D., Naumov A., Puchkin N. et al., Neural Networks 2023 Vol. 161 P. 242-253
This paper investigates the approximation properties of deep neural networks with piecewise-polynomial activation functions. We derive the required depth, width, and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error in Hölder norms in such a way that all weights of this neural network are bounded ...
Added: July 13, 2022
Sokolov A., Savchenko A., , in : 17th World Symposium on Applied Machine Intelligence and Informatics (SAMI). : IEEE, 2019. Ch. 19. P. 113-116.
In this article, we focus on the isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose to create a grammar model for a small testing command set with self-loops for each state to return blank symbols for noise and out-of-vocabulary words. In addition, we use single arc connected beginning and ending ...
Added: October 21, 2019