?
Markov Decision Processes and Stochastic Games with Total Effective Payoff
P. 103–115.
Gurvich V., Boros E., Elbassioni K., Makino K.
We consider finite Markov decision processes (MDPs) with undiscounted total effective payoff. We show that there exist uniformly optimal pure stationary strategies that can be computed by solving a polynomial number of linear programs. We apply this result to two-player zero-sum stochastic games with perfect information and undiscounted total effective payoff, and derive the existence of a saddle point in uniformly optimal pure stationary strategies.
In book
Vol. 30. , Dagstuhl: Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2015.
Belomestny D., Schoenmakers J., Zorina V., Journal of Complexity 2025 Vol. 88 Article 101932
We introduce a mesh-type approach for tackling discrete-time, finite-horizon Markov Decision Processes (MDPs) characterized by state and action spaces that are general, encompassing both finite and infinite (yet suitably regular) subsets of Euclidean space. In particular, for bounded state and action spaces, our algorithm achieves a computational complexity that is tractable in the sense of ...
Added: November 10, 2025
Sapronov Y., Yudin N., Computational Mathematics and Mathematical Physics 2025 Vol. 65 No. 3 P. 567–581
We continue to develop the concept of studying the ε-optimal policy for Average Reward Markov Decision Processes (AMDP) by reducing it to Discounted Markov Decision Processes (DMDP). Existing research often stipulates that the discount factor must not fall below a certain threshold. Typically, this threshold is close to one, and as is well-known, iterative methods ...
Added: June 10, 2025
Kolesnikov A., Popova S., / Series arXiv "math". 2024.
We consider the problem of optimal exchange which can be formulated as a kind of optimal transportation problem. The existence of an optimal solution and a duality theorem for the optimal exchange problem are proved in case of completely regular topological spaces. We show the connection between the problem of optimal exchange and the optimal ...
Added: December 20, 2024
Rudenko V., Yudin N., Васин А. А., Компьютерные исследования и моделирование 2023 Т. 15 № 2 С. 329–353
This article reviews both historical achievements and modern results in the field of Markov Decision Process (MDP) and convex optimization. This review is the first attempt to cover the field of reinforcement learning in Russian in the context of convex optimization. The fundamental Bellman equation and the criteria of optimality of policy — strategies based on it, ...
Added: November 29, 2024
Gladin E., Lavrik-Karmazin M., Zainullina K. et al., Proceedings of Machine Learning Research 2023 Vol. 206 P. 11506–11533
The problem of constrained Markov decision process is considered. An agent aims to maximize the expected accumulated discounted reward subject to multiple constraints on its costs (the number of constraints is relatively small). A new dual approach is proposed with the integration of two ingredients: entropy-regularized policy optimizer and Vaidya’s dual optimizer, both of which ...
Added: November 6, 2024
Gribanov D., Malyshev D., Shumilov I., Operations Research Forum 2024 Vol. 5 Article 32
In our note, we present a very simple and short proof of a new interesting fact about
the faces of an integer hull of a given rational polyhedron. This fact has a complete
analog in linear programming theory and can be useful to establish new constructive
upper bounds on the number of vertices in an integer hull of ...
Added: April 4, 2024
Ignatov A., , in: 14th International Conference, OPTIMA 2023, Petrovac, Montenegro, September 18–22, 2023, Revised Selected Papers. Communications in Computer and Information Science (CCIS, volume 1913)Vol. 1913.: Springer, 2023. P. 173–187.
Modeling protein folding, which is the process by which a protein obtains its spacial shape, still remains a challenging problem. Protein geometry might be simplified by using the coarse-grained models. The highest level of simplification is achieved in HP-models where only polarity of amino acid residues is considered, and the unified monomers are located in nodes ...
Added: January 18, 2024
Маршалко Г. Б., Математические вопросы криптографии 2014 Vol. 5 No. 2 P. 87–98
We show that neuron weights used in neural network-based biometric authentication scheme defined in GOST R 52633 standard series contain all the information on biometric data and secret key of the legitimate user. So, the complexity of evaluating (with known tables of neuron weights) the legitimate user's secret key is equivalent to the complexity of evaluating ...
Added: October 7, 2022
Chemodanov D., Esposito F., Calyam P. et al., IEEE Transactions on Network and Service Management 2019 Vol. 16 No. 1 P. 127–142
Virtual network services that span multiple data centers are important to support emerging data-intensive applications in fields such as bioinformatics and retail analytics. Successful virtual network service composition and maintenance requires flexible and scalable “constrained shortest path management” both in the management plane for virtual network embedding (VNE) or network function virtualization service chaining (NFV-SC), ...
Added: December 3, 2019
Гурарий М. М., Жаров М. М., Русаков С. Г. et al., Информационные технологии 2018 Т. 24 № 7 С. 435–444
The directions of improvement of minimax methods for circuit design problems are considered. The choices of generalized quality criterion for the circuit design is discussed. It is concluded that the minimax criterion has advantages over other formulations of design targets. New approach to setting of individual objectives for each performance indicator is proposed. The approach ...
Added: February 12, 2019
Gurvich V., Boros E., Elbassioni K. et al., Dynamic Games and Applications 2018 Vol. 8 No. 1 P. 22–41
We suggest a new algorithm for two-person zero-sum undiscounted stochastic games focusing on stationary strategies. Given a positive real εε, let us call a stochastic game εε-ergodic, if its values from any two initial positions differ by at most εε. The proposed new algorithm outputs for every ε>0ε>0 in finite time either a pair of stationary strategies for the two ...
Added: October 10, 2018
Lazarev A. A., Pravdivets N., Nekrasov I., Algorithms 2018 Vol. 11 No. 4 P. 1–13
We consider one approach to formalize the Resource-Constrained Project Scheduling Problem (RCPSP) in terms of combinatorial optimization theory. The transformation of the original problem into combinatorial setting is based on interpreting each operation as an atomic entity that has a defined duration and has to be resided on the continuous time axis meeting additional restrictions. ...
Added: October 1, 2018
Mikheev A. V., В кн.: Современное образование: содержание, технологии, качество. Материалы XXIV международной научно-методической конференции.Т. 2.: СПб.: Издательство СПбГЭТУ "ЛЭТИ", 2018. С. 55–56.
The issue of using the MathCAD software package in a university educational course for learning to solve optimization problems is considered. The advantage of working with this program is shown and its main features are discussed in the appendix to this course. ...
Added: April 24, 2018
Boros E., Elbassioni K., Gurvich V. et al., Optimization Letters 2017 Vol. 11 No. 8 P. 1499–1512
We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph (Formula presented.), with local rewards (Formula presented.), and three types of positions: black (Formula presented.), white (Formula presented.), and random (Formula presented.) forming a partition of V. It is a long-standing open question whether a polynomial time algorithm ...
Added: May 18, 2017
Kashtanov V., Зайцева О. Б., М.: КУРС: ИНФРА-М, 2016.
Contents of the book is divided into 2 parts of deterministic and stochastic models of Operations Research.
The first part of "Deterministic models of Operations Research" - is the base section, in which the emphasis is on linear programming.
The second part - "Stochastic models of Operations Research" includes a model of reliability and queuing models. This ...
Added: November 13, 2016
Belenky A., Applied Mathematics Letters 2007 Vol. 20 No. 7 P. 795–799
The maximin of a function being the minimum function of a sum of two bilinear functions with one and the same first vector argument belonging to a polyhedron is considered on a polyhedron of connected variables forming two second vector arguments of the bilinear functions. It is shown that finding the exact lower estimate of ...
Added: October 21, 2016
Belenky A., Egorova L., / Series WP7 "Математические методы анализа решений в экономике, бизнесе и политике". 2016. No. 02.
The paper presents two new approaches to modeling the interaction of small and medium price-taking traders with a stock exchange. In the framework of these approaches, the traders can form and manage their portfolios of financial instruments traded on a stock exchange with the use of linear, integer, and mixed programming techniques. Unlike previous authors’ ...
Added: June 27, 2016
Belenky A., Egorova L., , in: Advances in Intelligent Systems and ComputingIssue 359: Modelling, Computation and Optimization in Information Systems and Management Sciences.: Switzerland: Springer, 2015. P. 257–268.
The paper discusses a new approach to developing tools for quantitatively analyzing the financial behavior of small and medium price-taking traders each possessing abilities to predict share price values for a set of financial securities traded in a stock exchange. Tools for forming and managing a trader’s portfolio of securities from this set are proposed. ...
Added: June 1, 2015
Belenky A., Egorova L., / Series WP7 "Математические методы анализа решений в экономике, бизнесе и политике". 2015. No. WP7/2015/02.
Two mathematical models formalizing the decision-making process by a trader on developing and changing her investment portfolio in a stock exchange are presented. According to the first model the trader can correctly predict future values of financial securities of her interest. In this case, the problem of finding optimal strategies of investing in these financial ...
Added: May 31, 2015
Belenky A., Procedia Computer Science 2014 Vol. 31 P. 1150–1159
A part of a country’s electrical grid in which an electricity generator (which may consist of several base load power plants
and several peaking power plants) supplies electricity to a set of large customers of the grid, whereas the customers can a)
receive electricity from renewable sources of energy, b) store electricity in certain volumes, and c) ...
Added: September 29, 2014