A pseudo-polynomial algorithm for mean payoff stochastic games with perfect information and few random positions
We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph G = (V, E) with local rewards r: E → ℤ and three types of positions, black V_B, white V_W, and random V_R, which together form a partition of V. It is a long-standing open question whether a polynomial-time algorithm for BWR-games exists, even when |V_R| = 0. In fact, a pseudo-polynomial algorithm for BWR-games would already imply their polynomial solvability. In this paper, we show that BWR-games with a constant number of random positions can be solved in pseudo-polynomial time. More precisely, in any BWR-game with |V_R| = O(1), a saddle point in uniformly optimal pure stationary strategies can be found in time polynomial in |V_W| + |V_B|, the maximum absolute local reward, and the common denominator of the transition probabilities.
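To make the three quantities in the running-time bound concrete, the following Python sketch represents a small BWR-game and computes |V_W| + |V_B|, the maximum absolute local reward, and the common denominator of the transition probabilities. It is only an illustration of the input parameters, not the algorithm of the paper. The representation (sets of positions, dictionaries keyed by edges) and the name `bwr_parameters` are assumptions made for this example.

```python
from fractions import Fraction
from math import lcm  # requires Python 3.9+


def bwr_parameters(black, white, random_, rewards, probs):
    """Return (n, R, D) for a BWR-game (hypothetical helper).

    black, white, random_: sets of positions V_B, V_W, V_R.
    rewards: dict mapping each edge (u, v) to its integer local reward.
    probs: dict mapping each edge (u, v) leaving a random position u
           to its transition probability, given as a Fraction.
    """
    positions = black | white | random_
    # The three classes must form a partition of V, so they are disjoint.
    if len(positions) != len(black) + len(white) + len(random_):
        raise ValueError("position classes must be pairwise disjoint")
    for (u, v) in rewards:
        if u not in positions or v not in positions:
            raise ValueError(f"edge ({u}, {v}) leaves the position set")
    # Probabilities on the edges leaving each random position must sum to 1.
    for v in random_:
        total = sum(p for (u, _), p in probs.items() if u == v)
        if total != 1:
            raise ValueError(f"probabilities at {v} sum to {total}, not 1")

    n = len(white) + len(black)                      # |V_W| + |V_B|
    R = max(abs(r) for r in rewards.values())        # max absolute reward
    D = lcm(*(p.denominator for p in probs.values()))  # common denominator
    return n, R, D


# A toy game with one position of each type (made up for illustration).
rewards = {
    ("w", "r"): 3, ("w", "b"): -5, ("b", "w"): 2,
    ("r", "w"): 0, ("r", "b"): 1, ("r", "r"): 0,
}
probs = {
    ("r", "w"): Fraction(1, 2),
    ("r", "b"): Fraction(1, 3),
    ("r", "r"): Fraction(1, 6),
}
params = bwr_parameters({"b"}, {"w"}, {"r"}, rewards, probs)
print(params)
```

A pseudo-polynomial bound in the sense of the abstract is polynomial in these three numbers themselves, rather than in the bit lengths of the rewards and probabilities, for a fixed number of random positions.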