Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability

S. Samsonov; D. Tiapkin; A. Naumov; E. Moulines

Publications

?

Improved High-Probability Bounds for the Temporal Difference Learning Algorithm via Exponential Stability

Ch. 247. P. 4511–4547.

Samsonov S., Tiapkin D., Naumov A., Moulines E.

Language: English

Text on another site

Keywords: GTD learning linear stochastic approximation стохастическая аппроксимация Random matrix products

Publication based on the results of:

Development and theoretical analysis of new effective stochastic machine learning algorithms (2024)

In book

Proceedings of Machine Learning Research. Volume 247: The Thirty Seventh Annual Conference on Learning Theory, 30-3 July 2023, Edmonton, Canada

PMLR, 2024.

High-Order Error Bounds for Markovian LSA with Richardson–Romberg Extrapolation

Levin I., Naumov A., Samsonov S., , in: Proceedings of the AAAI Conference on Artificial Intelligence. AAAI-26: AAAI Technical Track on Planning, Routing, and Scheduling; AAAI Technical Track on Reasoning under Uncertainty; AAAI Technical Track on Search and Optimization. Main Track, volume 40 no. 43.: American Association for Artificial Intelligence (AAAI) Press, 2026. P. 36696–36704.

In this paper, we study the bias and high-order error bounds of the Linear Stochastic Approximation (LSA) algorithm with Polyak-Ruppert (PR) averaging under Markovian noise. We focus on the version of the algorithm with constant step size and propose a novel decomposition of the bias via a linearization technique. We analyze the structure of the ...

Added: April 17, 2026

SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning

Mangold P., Samsonov S., Labbi S. et al., , in: 38th Conference on Neural Information Processing Systems (NeurIPS 2024).: [б.и.], 2024. Ch. 37 P. 13927–13981.

In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (FedLSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of FedLSA scales polynomially with the inverse of the desired accuracy ϵ. To overcome this, we propose SCAFFLSA a new ...

Added: February 11, 2025

Finite-Time High-Probability Bounds for Polyak–Ruppert Averaged Iterates of Linear Stochastic Approximation

Durmus A., Moulines E., Naumov A. et al., Mathematics of Operations Research 2025 Vol. 50 No. 2 P. 935–964

This paper provides a finite-time analysis of linear stochastic approximation (LSA) algorithms with fixed step size, a core method in statistics and machine learning. LSA is used to compute approximate solutions of a $d$-dimensional linear system $\bar{\mathbf{A}} \theta = \bar{\mathbf{b}}$, for which $(\bar{\mathbf{A}}, \bar{\mathbf{b}})$ can only be estimated through (asymptotically) unbiased observations $\{(\mathbf{A}(Z_n),\mathbf{b}(Z_n))\}_{n \in \mathbb{N}}$. ...

Added: July 13, 2022

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

Durmus A., Moulines E., Naumov A. et al., , in: Advances in Neural Information Processing Systems 34 (NeurIPS 2021).: Curran Associates, Inc., 2021. P. 30063–30074.

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): ...

Added: February 17, 2022

On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

Durmus A., Moulines E., Naumov A. et al., , in: Proceedings of Machine Learning ResearchVol. 134: Conference on Learning Theory.: PMLR, 2021. P. 1711–1752.

Added: August 6, 2021

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

Kaledin M., Moulines E., Naumov A. et al., , in: Proceedings of Machine Learning ResearchVol. 125: Proceedings of Thirty Third Conference on Learning Theory.: [б.и.], 2020. P. 2144–2203.

Added: July 30, 2020