The paper proposes a new stochastic model of intervention control in commodity and stock markets. The essence of the intervention phenomenon is described in accordance with current economic theory, and a review of research on interventions is given. A general construction of the stochastic intervention model is developed as a discrete-time Markov process controlled at the moments when it hits the boundary of a given subset of the state set. The problem of optimal intervention control is thus reduced to the theoretical problem of controlling the specified process, referred to as the tuning problem. A general solution of the tuning problem for the discrete-time model is obtained. It is proved that the optimal control in this problem is deterministic and is determined by the global maximum point of a function of two discrete variables, for which an explicit analytic representation is obtained. It is noted that the solution of the stochastic tuning problem can serve as a basis for solving control problems in various technical systems in which some key parameter must be maintained within a given set of values.

This work establishes fast rates of convergence for empirical barycenters over a large class of geodesic spaces with curvature bounds in the sense of Alexandrov. More specifically, we show that parametric rates of convergence are achievable under natural conditions that characterize the bi-extendibility of geodesics emanating from a barycenter. These results considerably advance the state of the art on the subject, both in terms of rates of convergence and in the variety of spaces covered. In particular, our results apply to infinite-dimensional spaces such as the 2-Wasserstein space, where bi-extendibility of geodesics translates into regularity of Kantorovich potentials.

We propose a classification of symmetric conservative clones with a finite carrier. For this study, we use the functional Galois connection (*Inv*_*Q*, *Pol*_*Q*), a natural modification of the connection (*Inv*, *Pol*) based on the preservation relation between functions *f* on a set *A* (of all finite arities) and sets of functions *h* ∈ *A*^*Q* for an arbitrary set *Q*.

Using Okounkov's *q*-integral representation of Macdonald polynomials, we construct an infinite sequence Ω_1, Ω_2, Ω_3, … of countable sets linked by transition probabilities from Ω_*N* to Ω_{*N*−1} for each *N* = 2, 3, …. The elements of the sets Ω_*N* are the vertices of the extended Gelfand-Tsetlin graph, and the transition probabilities depend on the two Macdonald parameters, *q* and *t*. These data determine a family of Markov chains, and the main result is the description of their entrance boundaries. This work has its origin in asymptotic representation theory. In a subsequent paper, the main result is applied to the large-*N* limit transition in (*q*, *t*)-deformed *N*-particle beta-ensembles.

This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state-space Markov chain. This question is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g., for parameter tracking in online learning or reinforcement learning). Existing results impose strong conditions, such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chain. Our main contribution is an exponential stability result for the p-th moment of the random matrix product, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition and (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function related to the drift condition. Using this result, we give finite-time p-th moment bounds for constant- and decreasing-stepsize linear stochastic approximation schemes with Markovian noise on a general state space. We illustrate these findings for linear value-function estimation in reinforcement learning, providing finite-time p-th moment bounds for various members of the temporal difference (TD) family of algorithms.
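As an illustration of the setting (on a made-up toy example, not taken from the paper), tabular TD(0) for value-function estimation is a linear stochastic approximation scheme driven by Markovian noise: each update is affine in the parameter, with random coefficients determined by the sample path of the chain. A minimal sketch:

```python
import numpy as np

# Toy 2-state Markov reward process (illustrative, not from the paper).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # transition matrix of the chain
r = np.array([1.0, 0.0])          # per-state rewards
gamma = 0.9                       # discount factor

# The exact value function solves (I - gamma P) V = r.
V_exact = np.linalg.solve(np.eye(2) - gamma * P, r)

# Tabular TD(0) with decreasing stepsize, driven by one Markovian path:
# an LSA scheme whose random coefficients are functions of the chain.
rng = np.random.default_rng(0)
theta = np.zeros(2)
s = 0
for n in range(100_000):
    s_next = rng.choice(2, p=P[s])
    alpha = (1.0 + n) ** -0.7     # decreasing stepsize
    theta[s] += alpha * (r[s] + gamma * theta[s_next] - theta[s])
    s = s_next
# theta is now close to V_exact.
```

The chain here is uniformly ergodic, so it sits well inside the conditions of the paper; the point of the sketch is only the structure of the recursion.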

There exist two known canonical concepts of ultrafilter extensions of first-order models: one comes from modal logic and universal algebra, the other from model theory and the algebra of ultrafilters, with ultrafilter extensions of semigroups as its main precursor. By a classical fact of general topology, the space of ultrafilters over a discrete space is its largest compactification; ultrafilter extensions generalize this fact to discrete spaces endowed with an arbitrary first-order structure. Results of this type are referred to as extension theorems. We offer a uniform approach to both types of extensions, based on the idea of extending the extension procedure itself. We propose a generalization of the standard concept of first-order models in which functional and relational symbols are interpreted by ultrafilters over sets of functions and relations rather than by functions and relations themselves, together with appropriate semantics for generalized models of this form. We establish necessary and sufficient conditions under which generalized models are the canonical ultrafilter extensions of some ordinary models, and provide their topological characterization. We then provide an even wider concept of ultrafilter interpretations, together with semantics based on limits of ultrafilters, and show that it absorbs the former concept as well as the ordinary concept of models. We establish necessary and sufficient conditions under which generalized models in the wide sense are those in the narrow sense, or are the ultrafilter extensions of some ordinary models, and prove the corresponding extension theorems.

In this paper, a general stochastic model with controls applied at the moments when the random process hits the boundary of a given subset of the state set is proposed and studied. The general concept of the model is formulated, and its possible applications in technical and economic systems are described. Two versions of the general stochastic model are analyzed: one based on a continuous-time semi-Markov process with an embedded absorbing Markov chain, and one based on a discrete-time Markov process with absorbing states. New representations of the stationary cost index of control quality are obtained for both versions. It is shown that this index can be represented as a linear-fractional integral functional of two discrete probability distributions determining the control strategy. Results obtained earlier by the author on the extrema of such functionals are used to prove that, in both versions of the model, the optimal control is deterministic and is determined by the extremum points of functions of two discrete arguments, for which explicit analytic representations are obtained. Prospects for the further development of this stochastic model are outlined.
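To make the setting concrete, here is a minimal toy sketch (the walk, the costs, and the reward profile are all illustrative assumptions, not taken from the paper): a random walk is reset by a control whenever it hits the boundary of its working set, and, since an optimal control is deterministic, the long-run cost index can simply be scanned over the finitely many deterministic reset choices.

```python
import numpy as np

# Illustrative toy model (not from the paper): a lazy random walk on
# {0, ..., 10} should stay inside the working set {1, ..., 9}. When it
# hits the boundary {0, 10}, the control resets it to a chosen interior
# state, paying a fixed reset cost; interior steps earn a reward that
# peaks in the middle of the working set.
rng = np.random.default_rng(1)

def long_run_cost(reset_state, n_steps=20_000, reset_cost=5.0):
    """Monte Carlo estimate of the stationary cost index under the
    deterministic control 'always reset to reset_state'."""
    x, total = 5, 0.0
    for _ in range(n_steps):
        x += rng.choice([-1, 0, 1])
        if x in (0, 10):                       # boundary hit: control acts
            total -= reset_cost
            x = reset_state
        else:
            total += 1.0 - abs(x - 5) / 5.0    # state-dependent reward
    return total / n_steps

# The optimal control is deterministic, so it suffices to scan the
# finitely many deterministic reset choices and take the best one.
costs = {j: long_run_cost(j) for j in range(1, 10)}
best = max(costs, key=costs.get)
```

Resetting near the boundary triggers frequent reset costs and low rewards, so interior resets dominate; in the paper the optimal choice is obtained analytically rather than by simulation.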

We undertake a precise study of the non-asymptotic properties of vanilla generative adversarial networks (GANs) and derive theoretical guarantees in the problem of estimating an unknown $d$-dimensional density $p$ under a proper choice of the class of generators and discriminators. We prove that the resulting density estimate converges to $p$ in terms of Jensen-Shannon (JS) divergence at the rate $n^{-2\beta/(2\beta+d)}$ where $n$ is the sample size and $\beta$ determines the smoothness of $p$. This is the first result in the literature on density estimation using vanilla GANs with JS rates faster than $n^{-1/2}$ in the regime $\beta>d/2$.

The paper is devoted to the study of the unconditional extremal problem for a linear-fractional integral functional defined on a set of probability distributions. In contrast to results proved earlier, the integrands of the integral expressions in the numerator and the denominator of the problem under consideration depend on a real optimization parameter vector. Thus, the optimization problem is studied on the Cartesian product of a set of probability distributions and a set of admissible values of a real parameter vector. Three statements on the extremum of a linear-fractional integral functional are proved. It is established that, in all the variants, the solution of the original problem is completely determined by the extremal properties of the test function of the linear-fractional integral functional; this function is the ratio of the integrands of the numerator and the denominator. Possible applications of the results obtained to problems of optimal control of stochastic systems are described.

This paper addresses the problem of prediction with expert advice for outcomes in a geodesic space of non-positive curvature in the sense of Alexandrov. Via geometric considerations, and in particular the notion of barycenters, we extend to this setting the definition and analysis of the classical exponentially weighted average forecaster. We also adapt the principle of online-to-batch conversion to this setting. We briefly discuss the application of these results in the context of aggregation and to the problem of barycenter estimation.
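In the Euclidean plane, a flat and hence non-positively curved geodesic space, barycenters reduce to weighted averages, so the forecaster specializes to the classical exponentially weighted average. A minimal sketch with illustrative experts, losses, and learning rate (all made-up choices, not from the paper):

```python
import numpy as np

# Three fixed experts predicting points in the plane; outcomes are noisy
# observations of [1, 1], so the middle expert is (near-)optimal.
rng = np.random.default_rng(0)
experts = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
eta = 0.5                                   # learning rate (illustrative)
weights = np.ones(len(experts))
cum_loss_forecaster, cum_losses = 0.0, np.zeros(len(experts))

for t in range(1000):
    outcome = np.array([1.0, 1.0]) + 0.1 * rng.standard_normal(2)
    # Forecast = barycenter of the experts, here their weighted average.
    w = weights / weights.sum()
    forecast = w @ experts
    # Squared-distance losses.
    losses = np.sum((experts - outcome) ** 2, axis=1)
    cum_loss_forecaster += np.sum((forecast - outcome) ** 2)
    cum_losses += losses
    # Exponential weight update.
    weights *= np.exp(-eta * losses)

regret = cum_loss_forecaster - cum_losses.min()
```

In the general non-positively curved setting, only the "weighted average" step changes: it becomes a weighted barycenter in the geodesic space.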

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$. Our analysis is based on new results regarding moments and high-probability bounds for products of matrices, which are shown to be tight. We derive high-probability bounds on the performance of LSA under weaker conditions on the sequence $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$ than in previous works. In contrast, however, we establish polynomial concentration bounds, with order depending on the stepsize. We show that our conclusions cannot be improved without additional assumptions on the sequence of random matrices $\{{\bf A}_n: n \in \mathbb{N}^*\}$, and in particular that no Gaussian or exponential high-probability bounds can hold. Finally, we pay particular attention to establishing bounds that are of sharp order with respect to the number of iterations and the stepsize, and whose leading terms contain the covariance matrices appearing in the central limit theorems.
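A minimal sketch of the iteration under study, $\theta_{n+1} = \theta_n + \gamma({\bf b}_n - {\bf A}_n \theta_n)$, with Polyak-Ruppert averaging of the iterates (the matrices, noise level, and stepsize below are illustrative choices, not from the paper):

```python
import numpy as np

# Made-up target system: theta_star solves A_bar theta = b_bar.
rng = np.random.default_rng(0)
A_bar = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
b_bar = np.array([1.0, 1.0])
theta_star = np.linalg.solve(A_bar, b_bar)

gamma = 0.05                                 # fixed stepsize
n_iter = 50_000
theta = np.zeros(2)
avg = np.zeros(2)                            # Polyak-Ruppert average
for _ in range(n_iter):
    A_n = A_bar + 0.1 * rng.standard_normal((2, 2))  # noisy estimate of A_bar
    b_n = b_bar + 0.1 * rng.standard_normal(2)       # noisy estimate of b_bar
    theta = theta + gamma * (b_n - A_n @ theta)
    avg += theta / n_iter
# avg is close to theta_star, up to an O(gamma) bias and Monte Carlo error.
```

With a fixed stepsize the iterates fluctuate around the solution rather than converge to it, which is why the paper's concentration bounds must depend on the stepsize.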

Policy evaluation is an important instrument for the comparison of different algorithms in Reinforcement Learning (RL). Yet even a precise knowledge of the value function $V^{\pi}$ corresponding to a policy $\pi$ does not provide reliable information on how far the policy $\pi$ is from the optimal one. We present a novel model-free upper value iteration procedure ({\sf UVIP}) that allows us to estimate the suboptimality gap $V^{\star}(x) - V^{\pi}(x)$ from above and to construct confidence intervals for \(V^\star\). Our approach relies on upper bounds on the solution of the Bellman optimality equation obtained via a martingale approach. We provide theoretical guarantees for {\sf UVIP} under general assumptions and illustrate its performance on a number of benchmark RL problems.

In this paper we propose an efficient variance reduction approach for additive functionals of Markov chains that relies on a novel discrete-time martingale representation. Our approach is fully non-asymptotic and does not require knowledge of the stationary distribution (or even any type of ergodicity) or a specific structure of the underlying density. By rigorously analyzing the convergence properties of the proposed algorithm, we show that its cost-to-variance product is indeed smaller than that of the naive algorithm. The numerical performance of the new method is illustrated for Langevin-type Markov chain Monte Carlo (MCMC) methods.

In this paper we propose and study a generic variance reduction approach. The proposed method is based on minimization of the empirical variance over a suitable class of zero-mean control functionals. We discuss several ways of constructing zero-mean control functionals and present non-asymptotic error bounds for the variance-reduced Monte Carlo estimates. Finally, a simulation study showing the numerical efficiency of the proposed approach is presented.
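A minimal sketch of the idea under simplifying assumptions (the target function, the controls, and the distribution are illustrative choices, not from the paper): for X ~ N(0, 1), odd monomials have zero mean, so they are admissible control functionals, and the coefficients minimizing the empirical variance are obtained by least-squares regression of the target on the controls.

```python
import numpy as np

# Estimate E[f(X)] for X ~ N(0, 1) with f(x) = x + x^2 (true value 1),
# using the zero-mean control functionals g1(x) = x and g2(x) = x^3.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

f = x + x ** 2                       # target; E[f(X)] = 1
G = np.stack([x, x ** 3], axis=1)    # zero-mean controls under N(0, 1)

# Least-squares coefficients minimize the empirical variance of f - G c.
c, *_ = np.linalg.lstsq(G - G.mean(axis=0), f - f.mean(), rcond=None)

estimate = np.mean(f - G @ c)        # unbiased, since E[G] = 0
var_plain = np.var(f)                # variance without control functionals
var_reduced = np.var(f - G @ c)      # variance after variance reduction
```

Here the controls absorb the odd part of f, so the reduced variance is essentially that of the even remainder x^2; the paper's non-asymptotic bounds quantify this effect for general classes of control functionals.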