• A
  • A
  • A
  • АБВ
  • АБВ
  • АБВ
  • A
  • A
  • A
  • A
  • A
Обычная версия сайта
  • RU
  • EN
  • HSE University
  • Publications
  • Articles
  • Reinforcement Procedure for Randomized Machine Learning
  • RU
  • EN
Расширенный поиск
Высшая школа экономики
Национальный исследовательский университет
Priority areas
  • business informatics
  • economics
  • engineering science
  • humanitarian
  • IT and mathematics
  • law
  • management
  • mathematics
  • sociology
  • state and public administration
by year
  • 2027
  • 2026
  • 2025
  • 2024
  • 2023
  • 2022
  • 2021
  • 2020
  • 2019
  • 2018
  • 2017
  • 2016
  • 2015
  • 2014
  • 2013
  • 2012
  • 2011
  • 2010
  • 2009
  • 2008
  • 2007
  • 2006
  • 2005
  • 2004
  • 2003
  • 2002
  • 2001
  • 2000
  • 1999
  • 1998
  • 1997
  • 1996
  • 1995
  • 1994
  • 1993
  • 1992
  • 1991
  • 1990
  • 1989
  • 1988
  • 1987
  • 1986
  • 1985
  • 1984
  • 1983
  • 1982
  • 1981
  • 1980
  • 1979
  • 1978
  • 1977
  • 1976
  • 1975
  • 1974
  • 1973
  • 1972
  • 1971
  • 1970
  • 1969
  • 1968
  • 1967
  • 1966
  • 1965
  • 1964
  • 1963
  • 1958
  • More
Subject
News
June 5, 2026
Neural Network Maps as a Method for Constructing Mathematical Models
Scientists from HSE University–Nizhny Novgorod and the Institute of Physics Belgrade, Serbia, are jointly exploring the application of machine learning techniques and neural networks to the study of nonlinear dynamics. Natalya Stankevich, Leading Research Fellow at the Laboratory of Topological Methods in Dynamics of the Faculty of Informatics, Mathematics, and Computer Science at HSE University–Nizhny Novgorod, spoke to the HSE News Service about this international project.
June 5, 2026
‘In the Age of Technology, It Is Interesting to Look into the Past and Think about What We Can Take from It
Polina Tabakova decided to apply for a Philology degree at HSE in Nizhny Novgorod because she grew up in Mari El and did not want to move far away from the Russian forests. In an interview for the Young Scientists of HSE University project, she spoke about the genre of the campus novel, the existential drama of Kolobok, and a blackout version of Eugene Onegin.
June 5, 2026
HSE Scientists Develop Method to Compress Large Language Models Without Losing Quality
Researchers from the AI and Digital Science Institute at the HSE Faculty of Computer Science have developed a new compression method for large language models such as GPT and LLaMA that reduces their size by 25–36% without additional training or significant loss of accuracy. This is the first approach to use mathematical transformations—specifically, rotations of model weights—to make models more amenable to compression with structured matrices. The study results have been published in ACL Findings 2025. The code is available on GitHub.

 

Have you spotted a typo?
Highlight it, click Ctrl+Enter and send us a message. Thank you for your help!

Publications
  • Books
  • Articles
  • Chapters of books
  • Working papers
  • Report a publication
  • Research at HSE

?

Reinforcement Procedure for Randomized Machine Learning

Mathematics. 2023. Vol. 11. No. 17. Article 3651.
Yuri S. Popkov, Dubnov Y. A., Alexey Yu. Popkov

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We have developed a scheme of the reinforcement procedure based on the agent approach and Bellman’s optimality principle. This procedure ensures strictly monotonic properties of a sequence of local records in the iterative computational procedure of the learning process. The dependences of the dimensions of the neighborhood of the global minimum and the probability of its achievement on the parameters of the algorithm are determined. The convergence of the algorithm with the indicated probability to the neighborhood of the global minimum is proved.

Research target: Mathematics Computer Science
Language: English
Full text
DOI
Text on another site
Keywords: reinforcement learningBellman’s optimality principle randomized machine learning
Similar publications
Wave dynamics within the Whitham-Ostrovsky equation
Flamarion M. V., Pelinovsky E., Nonlinear Dynamics 2026 Vol. 114 Article 784
In this article, we investigate wave packet and solitary wave dynamics in the Whitham–Ostrovsky (WO) equation. By means of a multiple-scales expansion, we formally derive a nonlinear Schrödinger (NLS) equation governing the envelope evolution.The corresponding modulational stability diagram is then obtained using the Lighthill criterion. We show that sufficiently large values of the low-frequency dispersive term render ...
Added: June 5, 2026
On structural stability of 3-diffeomorphisms with the Smale solenoid attractor–repeller dynamics
Medvedev T. V., Pochinka O., Chaos 2026 Vol. 36 No. 6 Article 063107
We consider 3-diffeomorphisms with source–sink dynamics where Smale solenoids play the role of the source and the sink (NSSS-diffeomorphisms). It is known that such diffeomorphisms exist only on lens spaces. On the 3-sphere, every NSSS-diffeomorphism is associated with an exchangeable braid. An exchangeable braid with the strand number n was constructed for each n   3 in such a way ...
Added: June 4, 2026
Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)
Seul: PMLR, 2026.
Added: June 4, 2026
A model exhibiting all possible types of hyperbolic chaos on the 2-torus
Kazakov A., Shilov O. M., Mints D. et al., Chaos 2026 Vol. 36 No. 6 Article 063112
We study hyperbolic chaotic dynamics for maps of a two-dimensional torus. We introduce a two-parameter family of diffeomorphisms which, as we show, demonstrates all types of hyperbolic chaotic dynamics that can appear in the two-dimensional case. In addition, we describe all the bifurcations responsible for the transitions between these chaotic regimes. ...
Added: June 4, 2026
Об эквивалентности по надстройке декартовых произведений регулярных гомеоморфизмов с гомеоморфизмами Данжуа
Nozdrinova E., Pochinka O., Shmukler V., Математический сборник 2026 Т. 217 № 6 С. 71–89
Гомеоморфизмы топологических пространств называются эквивалентными по надстройке, если надстройки над ними топологически эквивалентны. В частности, топологически сопряженные гомеоморфизмы эквивалентны по надстройке. Известно, что для гомологически неприводимых гомеоморфизмов их топологическая сопряженность является необходимым и достаточным условием их эквивалентности по надстройке. Тогда как инварианты топологической сопряженности гомологически приводимых гомеоморфизмов во многих случаях являются избыточными для эквивалентности по ...
Added: June 3, 2026
Случайные блуждания на симметрических пространствах некомпактного типа ранга 1
Gnetov F., Konakov V., Успехи математических наук 2026 Т. 81 № 3 (489) С. 161–162
Пусть M обозначает симметрическое пространство некомпактного типа ранга 1. Опираясь на фундаментальную работу [1], в [2] было показано, что плотность соответствующим образом нормированной суммы независимых Hn-значных случайных величин, определенная через сложение Мёбиуса в модели шара Пуанкаре, сходится к фундаментальному решению соответствующего уравнения теплопроводности. Пределом являлся нормальный закон на Hn, соответствующий ядру теплопроводности, определяемому оператором Лапласа–Бельтрами. ...
Added: June 2, 2026
OpenAtom Foundation. Консорциум, развивающий Open Source в Китае.
Silakov D., Системный администратор 2026 № 3 С. 28–33
В статье про платформы для разработки открытого ПО в Китае мы рассказали про GitCode – молодой проект, позиционируемый как площадка для разработчиков со всего мира. Сейчас на GitCode размещаются проекты, созданные в КНР, но некоторые из них уже известны и на международной арене. Помочь открытым проектам в становлении, развитии и расширению аудитории призван фонд OpenAtom ...
Added: June 2, 2026
The recognition-by-components method
Slivnitsin P., Mylnikov L., Engineering Applications of Artificial Intelligence 2026 Vol. 179 Article 115185
The paper describes a applied artificial intelligence task of recognition-by-components method of real objects based on the recognition of a limited set of primitives or components. The recognition-by-components makes it possible to determine the components, that compose an object, and increase the number of recognizable objects without degrading the recognition quality. Training is performed on ...
Added: May 29, 2026
Electrical networks and data analysis in phylogenetics
Gorbounov Vassily, Kazakov A., Data Analytics and Topology 2025 Vol. 1 No. 1 P. 33–45
A classic problem in data analysis is studying the systems of subsets defined by either a similarity or a dissimilarity function on X which is either observed directly or derived from a data set. For an electrical network there are two functions on the set of the nodes defined by the resistance matrix and the response ...
Added: May 28, 2026
Brain-Computer Interfaces for Gait Rehabilitation After Stroke A Scoping Review
Mokienko O., Zisman M. A., Bobrov P. et al., American Journal of Physical Medicine and Rehabilitation 2026 Vol. 105 No. 6 P. 555–563
Brain-computer interfaces (BCIs) represent a promising technology for restoring lower limb motor functions and gait after stroke. The application of BCIs in this field is supported by a limited number of studies. The objective of the review was to systematically and critically evaluate the current evidence on the use of BCIs for lower limb function ...
Added: May 28, 2026
Generalizing the Brady-Yong Algorithm: Efficient Fast Hough Transform for Arbitrary Image Sizes
Kazimirov D., Rybakova E., Vitalii V. Gulevskii et al., IEEE Access 2025 Vol. 13 P. 20101–20132
The Hough (discrete Radon) transform (HT/DRT) is a digital image processing tool that has become indispensable in many application areas, ranging from general image processing to neural networks and X-ray computed tomography. The utilization of the HT in applied problems demands its computational efficiency and increased accuracy. The de facto standard algorithm for the fast ...
Added: May 28, 2026
Universal Comparison Methodology for Hough Transform Approaches
Kazimirov D., Vitalii Gulevskii, Kroshnin A. et al., Mathematics 2026 Article 1136
The Hough transform (HT) is widely used in computer vision, tomography, and neural networks. Numerous algorithms for HT computation have been proposed, making their systematic comparison essential. However, existing comparative methodologies are either non-universal and limited to certain HT formulations, or task-oriented, relying on application-specific criteria that do not fully capture algorithmic properties. This paper ...
Added: May 28, 2026
Разработка микросервиса ADP для идентификации источников выбросов на основе машинного обучения с подкреплением
Kychkin A., Chernitsin I., Прикладная информатика 2026 № 1(121) С. 40–58
The results of the development of a software microservice embedded in atmospheric air quality monitoring systems to support the identification of industrial pollution sources are presented. The emission and subsequent spread of harmful substances in the lower layers of the atmosphere is dynamic and characterized by high uncertainty due to the specific features of technological ...
Added: April 23, 2026
Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V
Cham: Springer, 2025.
This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025.   The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...
Added: September 29, 2025
Analysis of a Company Model in Conditions of Unstable Demand Using Reinforcement Learning Methods
Delev A., Semakov S., , in: 2025 8th International Conference on Artificial Intelligence and Big Data (ICAIBD).: IEEE, 2025. P. 318–322.
Profit is one of the most important economic indicators of a company’s performance, and for every company it is necessary to allocate resources in such a way as to obtain the maximum possible profit. The profit maximization problem is usually a dynamic optimization problem. This article discusses an approach to solving the production expansion problem ...
Added: August 25, 2025
Pseudo-collusion in a centralized algorithmic financial market
Pastushkov A., Boulatov A., Finance Research Letters 2025 Vol. 83 Article 107671
Recent studies have increasingly explored whether reinforcement learning algorithms can give rise to cooperative behavior that results in non-competitive pricing across various market settings. In financial markets, Cartea et al. (2022) show that market makers using multi-armed bandit (MAB) algorithms generally converge to competitive pricing in quote-driven over-the-counter (OTC) markets, barring some unlikely exceptions where ...
Added: June 19, 2025
The beer game bullwhip effect mitigation: a deep reinforcement learning approach
Rozhkov M., Alyamovskaya N., Zakhodiakin G., International Journal of Production Research 2025 Vol. 63 No. 18 P. 6630–6647
This article investigates the application of reinforcement learning (RL) methods to optimise a four-echelon linear supply chain model with stochastic demand. The proposed supply chain configuration is largely based on the production-distribution supply chain of the MIT Supply Chain Beer Game. We show that RL can significantly improve ordering efficiency and overall supply chain performance. ...
Added: March 24, 2025
Deep Reinforcement Learning-Based Congestion Control for File Transfer over QUIC
Blokhin A., Kalev V., Pusev R. et al., , in: 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON).: Novosibirsk: IEEE, 2024. P. 25–30.
Congestion control is one of the key mechanisms of communication in QUIC protocol which controls how much data and at which rate can be send to an endpoint at particular moment of time for better use of shared network resources and avoids moving into congestive collapse state. In this work we tackle the problem of ...
Added: December 18, 2024
Generative Flow Networks as Entropy-Regularized RL
Tiapkin D., Morozov N., Naumov A. et al., , in: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), 2-4 May 2024, Palau de Congressos, Valencia, Spain. PMLR: Volume 238Vol. 238.: Valencia: PMLR, 2024. P. 4213–4221.
The recently proposed generative flow networks (GFlowNets) are a method of training a policy to sample compositional discrete objects with probabilities proportional to a given reward via a sequence of actions. GFlowNets exploit the sequential nature of the problem, drawing parallels with reinforcement learning (RL). Our work extends the connection between RL and GFlowNets to ...
Added: June 22, 2024
Model-free Posterior Sampling via Learning Rate Randomization
Tiapkin D., Belomestny D., Calandriello D. et al., , in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023).: Curran Associates, Inc., 2023. P. 73719–73774.
Added: February 17, 2024
Randomized Machine Learning Algorithms to Forecast the Evolution of Thermokarst Lakes Area in Permafrost Zones
Yu. A. Dubnov, A. Yu. Popkov, Polishchuk V. Y. et al., Automation and Remote Control 2023 Vol. 84 No. 1 P. 64–81
Randomized machine learning focuses on problems with considerable uncertainty in data and models. Machine learning algorithms are formulated in terms of a functional entropylinear programming problem. We adapt these algorithms to forecasting problems on an example of the evolution of thermokarst lakes area in permafrost zones. Thermokarst lakes generate methane, a greenhouse gas affecting climate ...
Added: February 5, 2024
Fast Rates for Maximum Entropy Exploration
Tiapkin D., Belomestny D., Calandriello D. et al., , in: Proceedings of the 40th International Conference on Machine Learning: Volume 202: International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USAVol. 202: International Conference on Machine Learning, 23-29 July 2023, Honolulu, Hawaii, USA.: PMLR, 2023. P. 34161–34221.
Added: December 1, 2023
Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms
Tiapkin D., Belomestny D., Naumov A. et al., Working papers by Cornell University. Series math "arxiv.org" 2023 Article 2304.03056
In this work, we derive sharp non-asymptotic deviation bounds for weighted sums of Dirichlet random variables. These bounds are based on a novel integral representation of the density of a weighted Dirichlet sum. This representation allows us to obtain a Gaussian-like approximation for the sum distribution using geometry and complex analysis methods. Our results generalize ...
Added: June 28, 2023
Variance Reduction for Policy-Gradient Methods via Empirical Variance Minimization
Belomestny D., Kaledin M., Golubev A., /. 2022.
Policy-gradient methods in Reinforcement Learning(RL) are very universal and widely applied in practice but their performance suffers from the high variance of the gradient estimate. Several procedures were proposed to reduce it including actor-critic(AC) and advantage actor-critic(A2C) methods. Recently the approaches have got new perspective due to the introduction of Deep RL: both new control ...
Added: April 14, 2023
  • About
  • About
  • Key Figures & Facts
  • Sustainability at HSE University
  • Faculties & Departments
  • International Partnerships
  • Faculty & Staff
  • HSE Buildings
  • HSE University for Persons with Disabilities
  • Public Enquiries
  • Studies
  • Admissions
  • Programme Catalogue
  • Undergraduate
  • Graduate
  • Exchange Programmes
  • Summer University
  • Summer Schools
  • Semester in Moscow
  • Business Internship
  • Research
  • International Laboratories
  • Research Centres
  • Research Projects
  • Monitoring Studies
  • Conferences & Seminars
  • Academic Jobs
  • Yasin (April) International Academic Conference on Economic and Social Development
  • Media & Resources
  • Publications by staff
  • HSE Journals
  • Publishing House
  • iq.hse.ru: commentary by HSE experts
  • Library
  • Economic & Social Data Archive
  • Video
  • HSE Repository of Socio-Economic Information
  • HSE1993–2026
  • Contacts
  • Copyright
  • Privacy Policy
  • Site Map
Edit