Variance reduction for dependent sequences with applications to Stochastic Gradient MCMC
In this paper we propose a novel and practical variance reduction approach for additive functionals of dependent sequences. Our approach combines the use of control variates with the minimisation of an empirical variance estimate. We analyse finite sample properties of the proposed method and derive finite-time bounds of the excess asymptotic variance to zero. We apply our methodology to Stochastic Gradient MCMC (SGMCMC) methods for Bayesian inference on large data sets and combine it with existing variance reduction methods for SGMCMC. We present empirical results carried out on a number of benchmark examples showing that our variance reduction method achieves significant improvement as compared to state-of-the-art methods at the expense of a moderate increase of computational overhead.
In this paper we propose a novel variance reduction approach for additive functionals of Markov chains based on minimization of an estimate for the asymptotic variance of these functionals over suitable classes of control variates. A distinctive feature of the proposed approach is its ability to significantly reduce the overall finite sample variance. This feature is theoretically demonstrated by means of a deep non asymptotic analysis of a variance reduced functional as well as by a thorough simulation study. In particular we apply our method to various MCMC Bayesian estimation problems where it favourably compares to the existing variance reduction approaches.
In this paper, we propose a unified analysis of variants of distributed SGD with arbitrary compressions and delayed updates. Our framework is general enough to cover different variants of quantized SGD, Error-Compensated SGD (EC-SGD), and SGD with delayed updates (D-SGD). Via single theorem, we derive the complexity results for all the methods that fit our framework. For the existing methods, this theorem gives the best-known complexity results. Moreover, using our general scheme, we develop new variants of SGD that combine variance reduction or arbitrary sampling with error feedback and quantization and derive the convergence rates for these methods beating the state-of-the-art results. In order to illustrate the strength of our framework, we develop 16 new methods that fit this. In particular, we propose the first method called EC-SGD-DIANA that is based on error-feedback for biased compression operator and quantization of gradient differences and prove the convergence guarantees showing that EC-SGD-DIANA converges to the exact optimum asymptotically in expectation with constant learning rate for both convex and strongly convex objectives when workers compute full gradients of their loss functions. Moreover, for the case when the loss function of the worker has the form of finite sum, we modified the method and got a new one called EC-LSVRG-DIANA which is the first distributed stochastic method with error feedback and variance reduction that converges to the exact optimum asymptotically in expectation with constant learning rate.
In this paper we propose an efficient variance reduction approach for additive functionals of Markov chains relying on a novel discrete time martingale representation. Our approach is fully non-asymptotic and does not require the knowledge of the stationary distribution (and even any type of ergodicity) or specific structure of the underlying density. By rigorously analyzing the convergence properties of the proposed algorithm, we show that its cost-to-variance product is indeed smaller than one of the naive algorithm. The numerical performance of the new method is illustrated for the Langevin-type Markov Chain Monte Carlo (MCMC) methods.
We consider certain spaces of functions on the circle, which naturally appear in harmonic analysis, and superposition operators on these spaces. We study the following question: which functions have the property that each their superposition with a homeomorphism of the circle belongs to a given space? We also study the multidimensional case.
We consider the spaces of functions on the m-dimensional torus, whose Fourier transform is p -summable. We obtain estimates for the norms of the exponential functions deformed by a C1 -smooth phase. The results generalize to the multidimensional case the one-dimensional results obtained by the author earlier in “Quantitative estimates in the Beurling—Helson theorem”, Sbornik: Mathematics, 201:12 (2010), 1811 – 1836.
We consider the spaces of function on the circle whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C1 -smooth phase.
For a class of optimal control problems and Hamiltonian systems generated by these problems in the space l 2, we prove the existence of extremals with a countable number of switchings on a finite time interval. The optimal synthesis that we construct in the space l 2 forms a fiber bundle with piecewise smooth two-dimensional fibers consisting of extremals with a countable number of switchings over an infinite-dimensional basis of singular extremals.
The problem of minimizing the root mean square deviation of a uniform string with clamped ends from an equilibrium position is investigated. It is assumed that the initial conditions are specified and the ends of the string are clamped. The Fourier method is used, which enables the control problem with a partial differential equation to be reduced to a control problem with a denumerable system of ordinary differential equations. For the optimal control problem in the l2 space obtained, it is proved that the optimal synthesis contains singular trajectories and chattering trajectories. For the initial problem of the optimal control of the vibrations of a string it is also proved that there is a unique solution for which the optimal control has a denumerable number of switchings in a finite time interval.
This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems” which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.
In this paper, we construct a new distribution corresponding to a real noble gas as well as the equation of state for it.