Reinforcement Procedure for Randomized Machine Learning

Yuri S. Popkov; Y. A. Dubnov; Alexey Yu. Popkov

doi:10.3390/math11173651

Mathematics. 2023. Vol. 11. No. 17. Article 3651.

Yuri S. Popkov, Dubnov Y. A., Alexey Yu. Popkov

This paper is devoted to problem-oriented reinforcement methods for the numerical implementation of Randomized Machine Learning. We develop a reinforcement procedure based on the agent approach and Bellman’s optimality principle. The procedure ensures that the sequence of local records in the iterative computational procedure of the learning process is strictly monotonic. We determine how the size of the neighborhood of the global minimum, and the probability of reaching it, depend on the parameters of the algorithm. We prove that the algorithm converges to this neighborhood of the global minimum with the indicated probability.

Research target: Mathematics; Computer Science

Keywords: reinforcement learning; Bellman’s optimality principle; randomized machine learning

Chepovskiy A. M. Analysis of Natural-Language Text Corpora. Mathematical Methods. Textbook. Moscow: Мастерская Печати Идей, 2026. 274 pp., ill.

Chepovskiy A., Мастерская Печати Идей, 2026.

The textbook presents methods and algorithms for automatic analysis of corpora of texts in natural languages. It is intended for students of methods of processing texts in natural languages and creating training arrays of texts. For students, graduate students and researchers studying methods of computational linguistics and word processing. ...

Added: August 1, 2026

Radomskii A., Mathematical notes 2026 Vol. 119 No. 6 P. 1136–1147

We obtain an upper bound for the sum $\sum_{n\leq N} (a_{n}/\varphi (a_{n}))^{s}$, where $\varphi$ is Euler's totient function, $s\in\mathbb{N}$, and $a_{1},\ldots, a_{N}$ are positive integers (not necessarily distinct) with some restrictions. As applications, for any $t>0$, we obtain an upper bound for the number of $n\in [1,N]$ such that $a_{n}/ \varphi (a_{n})> t$. ...

Added: July 31, 2026

The Quadratic Reciprocity Law and Its Generalizations

Абызов А. Н., Буутай П. Н., Математика и теоретические компьютерные науки 2026 Vol. 4 No. 2 P. 4–75

This paper is expository and methodological in nature and is devoted to the development of E.I. Zolotarev’s ideas embedded in his approach to the proof of the quadratic reciprocity law (1872). We consider extensions of Zolotarev’s approach to abstract number rings presented in the work of A. Brunyate and P.L. Clark (2015), and to finite ...

Added: July 30, 2026

Three Algorithms for Merging Hierarchical Navigable Small World Graphs

Ponomarenko A. / Series Computer Science "arxiv.org". 2025.

This paper addresses the challenge of merging hierarchical navigable small world (HNSW) graphs, a critical operation for distributed systems, incremental indexing, and database compaction. We propose three algorithms for this task: Naive Graph Merge (NGM), Intra Graph Traversal Merge (IGTM), and Cross Graph Traversal Merge (CGTM). These algorithms differ in their approach to vertex selection ...

Added: July 30, 2026

Professional Verification: A Guide to Advanced Functional Verification

Уилкокс П., Romanov A., Moscow: ДМК Пресс, 2025.

The book you are holding continues the series «Книжная полка истового инженера», published with the support of YADRO. It is a textbook on the theoretical foundations of advanced functional verification and contains the best practices currently in use. It describes the Unified Verification Methodology (UVM) in detail and covers topics such as the functional virtual prototype, functional coverage, assertions, formal verification, testbenches, co-simulation, emulation, hardware ...

Added: July 30, 2026

EEG evidence for reproducible neural states during Buddhist Highest Yoga Tantra meditation

Mikhaylets E. V., Razorenova A. M., Chernyshev V. L. et al., Scientific Reports 2026 Vol. 16 Article 23560

Meditation offers a naturalistic paradigm for studying introspection, yet the neural dynamics of advanced tantric practices remain largely unexplored. Buddhist Highest Yoga Tantra (BHYT) comprises a sequence of eight dissolution stages culminating in the “clear light” state. We recorded EEG during eyes-closed BHYT meditation performed in monasteries and hermitages (51 sessions from 36 male practitioners; ...

Added: July 29, 2026

Massey Products and Relations in the Cohomology of Steenrod Algebras

Попеленский Ф. Ю., Математический сборник 2026 Vol. 217 No. 2 P. 108–153

In a recent paper Buchstaber and the author introduced a new structure on the cohomology of Hopf algebras in terms of the Buchstaber spectral sequence (Bss). We fully calculate this structure on the cohomology (known for a long time) of the important Hopf subalgebra A(1) of the classical Steenrod algebra A2. As part of a demonstration ...

Added: July 28, 2026

Three-dimensional magnetization textures as quaternionic functions

Metlov K., Andrei B. Bogatyrëv, Annalen der Physik 2026 Vol. 538 No. 6 Article e70234

Thanks to recent progress in full three-dimensional nanoscale imaging of bulk magnetization distributions, there is growing interest in three-dimensional (3D) magnetization textures, which promise new high-information-density spintronic applications. Compared to 1D domain walls or 2D magnetic vortices/skyrmions, they are much harder to represent, analyze and reason about. Here we build an analytical representation for such ...

Added: July 28, 2026

Machine Learning-based Adaptive Reconstruction of Video Stream Fragments Taking into Account Scene Dynamics. Proceedings of the Institute for System Programming of the RAS

Думкин Н. А., Alexandrov D., Прозорский М. А., Труды Института системного программирования РАН 2026 Vol. 38 No. 1 P. 255–274

A theoretically sound approach to adaptive client-side video fragment restoration is proposed using machine learning and scene analysis methods. The method includes a formal problem statement, a finite-state machine model for decision making, a restoration cost function, and a new stage in video preparation: scene dynamics assessment followed by recording a feature in an HLS playlist. This feature ...

Added: July 27, 2026

Nonlinear Neumann eigenvalues in outward cuspidal domains with weighted measure

Menovshchikov A., Ukhlov A., Rendiconti del Circolo Matematico di Palermo 2026 Vol. 75 Article 91

We consider the nonlinear Neumann eigenvalue problem in outward cuspidal domains with a weighted measure. Using composition operators on Sobolev spaces, we establish embeddings of Sobolev spaces into weighted Lebesgue spaces. These embeddings give the solvability of the Neumann spectral problem in this setting and provide estimates for the corresponding weighted Neumann eigenvalues. ...

Added: July 27, 2026

On the (p,q)-Eigenvalues of the No-Flux p-Laplacian

Menovshchikov A., Journal of Mathematical Sciences 2026 Vol. 298 P. 608–618

We study the set of (p, q)-eigenvalues of the p-Laplace operator with no-flux boundary conditions. We show that this set is closed and that its smallest positive element (the first nontrivial eigenvalue) admits a variational characterization. Moreover, we establish lower bounds for this eigenvalue in cuspidal domains. ...

Added: July 27, 2026

Automated Reasoning: 13th International Joint Conference, IJCAR 2026, Lisbon, Portugal, July 26–29, 2026, Proceedings, Part II. (LNCS, volume 16689)

Cham: Springer, 2026.

This open access set, LNAI 16688-16689, constitutes the proceedings of the 13th International Joint Conference, IJCAR 2026, held in Lisbon, Portugal, during July 26–29, 2026. The 41 full research papers and 8 short papers included in these two volumes were carefully reviewed and selected from 112 submissions. The papers cover the following topical sections: Part I: Theorem ...

Added: July 26, 2026

Local Fault-Tolerant Routing in 3D Mesh NoCs using Single-Hop Rollback

Edward R. Rzaev, Aleksandr Y. Romanov, Andrey M. Sukhov, IEEE Access 2026 Vol. 14 P. 2169–3536

This work presents a hierarchy of strictly local fault-tolerant routing algorithms for 3D mesh networks-on-chip, culminating in an algorithm that combines a live-neighbor selection rule with a bounded single-hop rollback mechanism. The proposed algorithms operate exclusively on immediate neighbor information, maintain O(1) per hop complexity, and require no global topology knowledge, additional virtual channels, or ...

Added: July 23, 2026

Bibliometrics of Folklore: Russian Proverbs in Research Journals

Pislyakov V., Вестник Томского государственного университета. Филология 2026 No. 101 P. 175–192

This article examines the use of proverbs in academic texts—specifically, articles published in Russian research journals. For the experiment, ten proverbs were selected as the intersection of two fundamentally different paremiological surveys aimed at compiling lists of popular or common Russian proverbs. One of these surveys was conducted by the classic of paremiology, G.L. Permyakov, ...

Added: July 22, 2026

SIGIR '26: Proceedings of the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval

Association for Computing Machinery (ACM), 2026.

Wominjeka, and welcome to the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2026), held in Melbourne | Naarm, Australia, from 20–24 July 2026. SIGIR 2026 takes place on the unceded lands of the Woi Wurrung and Boon Wurrung language groups of the eastern Kulin nation, and we pay our ...

Added: July 22, 2026

Long-range machine-learning potentials with environment-dependent charges enable predicting LO-TO splitting and dielectric constants

Korogod D., Shapeev A., Ivan S. Novikov, Physical Review B: Condensed Matter and Materials Physics 2026 Vol. 114 No. 2 Article 024104

We present two models with explicit long-range electrostatics in the form of Coulomb interactions. Both models include point charges depending on their local atomic environments, and the second model also conserves a total charge of an atomic system. We combine the proposed long-range models with the local moment tensor potential (MTP) and demonstrate that they ...

Added: July 22, 2026

Global optimization of atomic clusters via physically constrained tensor train decomposition

Sozykin K., Rybin N., Chertkov A. et al., Physical Review B: Condensed Matter and Materials Physics 2026 Vol. 113 No. 22 Article 224111

The global optimization of atomic clusters represents a fundamental challenge in computational chemistry and materials science due to the exponential growth of local minima with system size (i.e., the curse of dimensionality). We introduce a framework that overcomes this limitation by exploiting the low-rank structure of potential energy surfaces through tensor train (TT) decomposition. Our ...

Added: July 22, 2026

Kolmogorov Operators and Their Applications

Singapore: Springer, 2024.

Conference series: INdAM Meeting: Kolmogorov Operators and their Applications (INdAM 2022). Kolmogorov equations are a fundamental bridge between the theory of partial differential equations and that of stochastic differential equations that arise in several research fields. This volume collects a selection of the talks given at the Cortona meeting by ...

Added: July 17, 2026

Development of the ADP Microservice for Identifying Emission Sources Based on Reinforcement Learning

Kychkin A., Chernitsin I., Прикладная информатика 2026 No. 1(121) P. 40–58

The results of the development of a software microservice embedded in atmospheric air quality monitoring systems to support the identification of industrial pollution sources are presented. The emission and subsequent spread of harmful substances in the lower layers of the atmosphere is dynamic and characterized by high uncertainty due to the specific features of technological ...

Added: April 23, 2026

Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V

Cham: Springer, 2025.

This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025. The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...

Added: September 29, 2025

Analysis of a Company Model in Conditions of Unstable Demand Using Reinforcement Learning Methods

Delev A., Semakov S., in: 2025 8th International Conference on Artificial Intelligence and Big Data (ICAIBD). IEEE, 2025. P. 318–322.

Profit is one of the most important economic indicators of a company’s performance, and for every company it is necessary to allocate resources in such a way as to obtain the maximum possible profit. The profit maximization problem is usually a dynamic optimization problem. This article discusses an approach to solving the production expansion problem ...

Added: August 25, 2025

Pseudo-collusion in a centralized algorithmic financial market

Pastushkov A., Boulatov A., Finance Research Letters 2025 Vol. 83 Article 107671

Recent studies have increasingly explored whether reinforcement learning algorithms can give rise to cooperative behavior that results in non-competitive pricing across various market settings. In financial markets, Cartea et al. (2022) show that market makers using multi-armed bandit (MAB) algorithms generally converge to competitive pricing in quote-driven over-the-counter (OTC) markets, barring some unlikely exceptions where ...

Added: June 19, 2025

The beer game bullwhip effect mitigation: a deep reinforcement learning approach

Rozhkov M., Alyamovskaya N., Zakhodiakin G., International Journal of Production Research 2025 Vol. 63 No. 18 P. 6630–6647

This article investigates the application of reinforcement learning (RL) methods to optimise a four-echelon linear supply chain model with stochastic demand. The proposed supply chain configuration is largely based on the production-distribution supply chain of the MIT Supply Chain Beer Game. We show that RL can significantly improve ordering efficiency and overall supply chain performance. ...

Added: March 24, 2025

Deep Reinforcement Learning-Based Congestion Control for File Transfer over QUIC

Blokhin A., Kalev V., Pusev R. et al., in: 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON). Novosibirsk: IEEE, 2024. P. 25–30.

Congestion control is one of the key communication mechanisms in the QUIC protocol. It controls how much data, and at what rate, can be sent to an endpoint at a particular moment, so that shared network resources are used better and congestive collapse is avoided. In this work we tackle the problem of ...

Added: December 18, 2024