Bayesian Sparsification of Gated Recurrent Neural Networks

P. 1–6.
Lobacheva E., Chirkova N., Vetrov D.

Bayesian methods have been successfully applied to sparsify the weights of neural networks and to remove structural units, e.g., neurons, from the networks. We apply and further develop this approach for gated recurrent architectures. Specifically, in addition to sparsifying individual weights and neurons, we propose to sparsify the preactivations of gates and the information flow in LSTM. This makes some gates and information flow components constant, speeds up the forward pass, and improves compression. Moreover, the resulting structure of gate sparsity is interpretable and depends on the task.
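
The effect of a sparsified gate preactivation can be illustrated with a minimal sketch. It is not the authors' implementation: the multiplicative variable `scale`, the placement of the bias outside the scaling, and all weights and inputs are assumptions chosen for illustration. When the preactivation is removed, the gate reduces to a constant that depends only on its bias, so its dot products can be skipped in the forward pass.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(w_x, w_h, bias, x, h, scale=1.0):
    """One scalar gate: sigmoid(scale * (w_x.x + w_h.h) + bias).

    `scale` is a hypothetical multiplicative variable on the gate
    preactivation; scale == 0 models a sparsified preactivation.
    """
    if scale == 0.0:
        # Preactivation removed: the gate is constant, no dot products needed.
        return sigmoid(bias)
    pre = sum(a * b for a, b in zip(w_x, x)) + sum(a * b for a, b in zip(w_h, h))
    return sigmoid(scale * pre + bias)


# Hypothetical weights and (input, hidden state) pairs, for illustration only.
w_x, w_h, bias = [0.5, -1.0], [0.3, 0.8], 2.0
inputs = [([1.0, 2.0], [0.0, 1.0]), ([-3.0, 0.5], [2.0, -1.0])]

# Dense gate: the value changes with the input and hidden state.
dense = [gate(w_x, w_h, bias, x, h) for x, h in inputs]
# Sparsified gate: the same constant for every input.
sparse = [gate(w_x, w_h, bias, x, h, scale=0.0) for x, h in inputs]
```

In this sketch the sparsified gate needs only its bias to be stored, which is the sense in which a constant gate both compresses the model and shortens the forward pass.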

Language: English
Keywords: recurrent neural networks

In book

Workshop on Compact Deep Neural Network Representation with Industrial Applications, Thirty-second Conference on Neural Information Processing Systems
Montréal: [s.n.], 2018.
Similar publications
An Ensemble of Modern Computer Vision Models for Deepfake Detection
Pikul A. S., Безопасность информационных технологий 2024 Vol. 31 No. 4 P. 116–127
This article explores the potential use of modern computer vision architectures for the task of deepfake detection. The following architectures are considered: EfficientNet, Vision Transformer (ViT), VisionLSTM (ViL), Vision KAN, and Mamba Vision. The novelty of the approach lies in the application and comparison of these architectures, as well as their combination into paired ensembles ...
Added: December 12, 2025
Application of Large Language Models to Solving Differential Equations: Constructing Baseline Models with LSTM and GRU
Surkov A., Zakharov V., Sergei Koltcov et al., in: Smart Technologies, Systems and Applications: 4th International Conference, SmartTech-IC 2024, Quito, Ecuador, December 2–4, 2024, Revised Selected Papers, Part II. Vol. 2. Springer, 2025. P. 239–252.
Large language models are developing rapidly and are beginning to be used to solve some mathematical problems. With the emergence of the xLSTM model, which demonstrates results comparable to those of transformer-based models, there has been a surge of interest in recurrent neural networks. This paper considers the application of baseline recurrent models such as LSTM and ...
Added: September 11, 2025
Using a Recurrent Neural Network To Inform the Use of Prostate-specific Antigen (PSA) and PSA Density for Dynamic Monitoring of the Risk of Prostate Cancer Progression on Active Surveillance
Sushentsev N., Abrego L., Colarieti A. et al., EUROPEAN UROLOGY OPEN SCIENCE 2023 Vol. 52 P. 36–39
The global uptake of prostate cancer (PCa) active surveillance (AS) is steadily increasing. While prostate-specific antigen density (PSAD) is an important baseline predictor of PCa progression on AS, there is a scarcity of recommendations on its use in follow-up. In particular, the best way of measuring PSAD is unclear. One approach would be to use ...
Added: February 28, 2024
Forecasting Energy Consumption Using Automated Machine Learning
Danilov K., Автоматизация. Современные технологии 2020 Vol. 74 No. August 2020 P. 402–407
The problem of forecasting energy consumption using automated machine learning is considered. A scheme of the process for automatically creating and applying a forecasting model is presented. The proposed approach was tested on electricity consumption data for the regions of Russia. The computational experiment showed that the developed model is highly effective. Forecasting accuracy was 97–99%. ...
Added: June 13, 2022
Self-supervised recurrent depth estimation with attention mechanisms
Makarov I., Bakhanova M., Nikolenko S. et al., PeerJ Computer Science 2022 Vol. 8 Article e865
Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced ...
Added: February 1, 2022
On the Embeddings of Variables in Recurrent Neural Networks for Source Code
Chirkova N., in: 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021). Association for Computational Linguistics, 2021. P. 2679–2689.
Source code processing heavily relies on the methods widely used in natural language processing (NLP), but involves specifics that need to be taken into account to achieve higher quality. An example of this specificity is that the semantics of a variable is defined not only by its name but also by the contexts in which ...
Added: August 31, 2021
Deep learning approach for predicting functional Z-DNA regions using omics data
Beknazarov N., Jin S., Poptsova M., Scientific Reports 2020 Vol. 10 P. 19134
Computational methods to predict Z-DNA regions are in high demand to understand the functional role of Z-DNA. The previous state-of-the-art method Z-Hunt is based on statistical mechanical and energy considerations about B- to Z-DNA transition using sequence information. Z-DNA CHiP-seq experiment results showed little overlap with Z-Hunt predictions implying that sequence information only is not ...
Added: December 11, 2020
Structured Sparsification of Gated Recurrent Neural Networks
Lobacheva E., Chirkova N., Markovich A. et al., in: Thirty-Fourth AAAI Conference on Artificial Intelligence. Vol. 34. AAAI Press, 2020. Ch. 5938 P. 4989–4996.
Added: October 29, 2020
Morphological segmentation with sequence to sequence neural network
Arefyev N. V., Gratsianova T. Y., Popov K., in: Computational Linguistics and Intellectual Technologies. International Conference "Dialogue 2018" Proceedings. Moscow: Conference Proceedings Editorial board, 2018. P. 85–95.
Morphological segmentation is an important task of natural language processing as it can significantly improve the processing of unfamiliar and rare words in different tasks that involve text data. In this paper we present datasets in English and Russian for learning and evaluating morphological segmentation algorithms, demonstrate the method based on the sequence to sequence ...
Added: October 9, 2020
Efficient Language Modeling with Automatic Relevance Determination in Recurrent Neural Networks
Kodryan M., Grachev A., Ignatov D. I. et al., in: Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019). Issue W19-43. Association for Computational Linguistics, 2019. P. 40–48.
Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural networks compression. We find this method to be especially useful in language modeling tasks, where large number of parameters in ...
Added: November 1, 2019
Compression of recurrent neural networks for efficient language modeling
Grachev A., Ignatov D. I., Savchenko A., Applied Soft Computing Journal 2019 Vol. 79 P. 354–362
Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks including Long–Short Term Memory models. We pay particular attention ...
Added: June 12, 2019
Continuous Gesture Recognition from sEMG Sensor Data with Recurrent Neural Networks and Adversarial Domain Adaptation
Shpilman A., Sosin I., Kudenko D., in: 2018 15th International Conference on Control, Automation, Robotics and Vision (ICARCV). IEEE, 2018. P. 1436–1441.
Movement control of artificial limbs has made big advances in recent years. New sensor and control technology enhanced the functionality and usefulness of artificial limbs to the point that complex movements, such as grasping, can be performed to a limited extent. To date, the most successful results were achieved by applying recurrent neural networks (RNNs), ...
Added: January 18, 2019
Bayesian Sparsification of Recurrent Neural Networks
Lobacheva E., Chirkova N., Vetrov D., in: 1st Workshop on Learning to Generate Natural Language, International Conference on Machine Learning. [s.n.], 2017. P. 1–8.
Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights. Recently proposed Sparse Variational Dropout (Molchanov et al., 2017) eliminates the majority of the weights in a feed-forward neural network without significant loss of quality. We apply this technique to sparsify recurrent neural ...
Added: October 30, 2018
SEARNN: Training RNNs with global-local losses
Leblond R., Alayrac J., Osokin A. et al., in: Proceedings of the 6th International Conference on Learning Representations (ICLR 2018). [s.n.], 2018. P. 1–16.
We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an ...
Added: October 29, 2018