• A
  • A
  • A
  • АБВ
  • АБВ
  • АБВ
  • A
  • A
  • A
  • A
  • A
Обычная версия сайта
  • RU
  • EN
  • HSE University
  • Publications
  • Preprints
  • Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
  • RU
  • EN
Расширенный поиск
Высшая школа экономики
Национальный исследовательский университет
Priority areas
  • business informatics
  • economics
  • engineering science
  • humanitarian
  • IT and mathematics
  • law
  • management
  • mathematics
  • sociology
  • state and public administration
by year
  • 2027
  • 2026
  • 2025
  • 2024
  • 2023
  • 2022
  • 2021
  • 2020
  • 2019
  • 2018
  • 2017
  • 2016
  • 2015
  • 2014
  • 2013
  • 2012
  • 2011
  • 2010
  • 2009
  • 2008
  • 2007
  • 2006
  • 2005
  • 2004
  • 2003
  • 2002
  • 2001
  • 2000
  • 1999
  • 1998
  • 1997
  • 1996
  • 1995
  • 1994
  • 1993
  • 1992
  • 1991
  • 1990
  • 1989
  • 1988
  • 1987
  • 1986
  • 1985
  • 1984
  • 1983
  • 1982
  • 1981
  • 1980
  • 1979
  • 1978
  • 1977
  • 1976
  • 1975
  • 1974
  • 1973
  • 1972
  • 1971
  • 1970
  • 1969
  • 1968
  • 1967
  • 1966
  • 1965
  • 1964
  • 1963
  • 1958
  • More
Subject
News
May 25, 2026
HSE Scientists Train Neural Network to 'Hear' Faults in Electric Motors
Researchers at the AI and Digital Science Institute of the HSE Faculty of Computer Science have developed a new method—the Signature-Guided Data Augmentation (SGDA) framework—that achieves 99% accuracy in motor fault detection and 86% accuracy in fault classification. The application of this approach can reduce industrial equipment repair costs, minimise downtime, and improve production safety. The study results have been published in Engineering Applications of Artificial Intelligence.
May 25, 2026
'The Humanities Serve as a Conscience'
Maria Mizernaia studies Soviet literature and the history of book publishing. In this interview for the HSE Young Scientists project, she discusses plans to publish a novel about besieged Leningrad, AI-provoked reflections on what it means to be human, and how novels can help satisfy our dopamine hunger.
May 25, 2026
Is It Possible to Predict a Citys Life Based on the Shape of Its Neighbourhoods?
Is it possible to predict, based on the configuration of streets and buildings, where a café will open or where traffic congestion will occur? Participants in the Spatial Analysis and Modelling of Urban Processes research and study group use open data and machine learning to identify universal patterns. Alexander Sheludkov and Eduard Somov discuss the purpose of comparing cities, the need for new forms of urban statistics, and how open data is transforming approaches to urban studies.

 

Have you spotted a typo?
Highlight it, click Ctrl+Enter and send us a message. Thank you for your help!

Publications
  • Books
  • Articles
  • Chapters of books
  • Working papers
  • Report a publication
  • Research at HSE

?

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation

2024.
Sheshukova M., Belomestny D., Durmus A., Moulines E., Naumov A., Samsonov S.
We address the problem of solving strongly convex and smooth minimization problems using stochastic gradient descent (SGD) algorithm with a constant step size. Previous works suggested to combine the Polyak-Ruppert averaging procedure with the Richardson-Romberg extrapolation technique to reduce the asymptotic bias of SGD at the expense of a mild increase of the variance. We significantly extend previous results by providing an expansion of the mean-squared error of the resulting estimator with respect to the number of iterations n. More precisely, we show that the mean-squared error can be decomposed into the sum of two terms: a leading one of order $n^{-1/2}$ with explicit dependence on a minimax-optimal asymptotic covariance matrix, and a second-order term of order $n^{-3/4}$ where the power 3/4 can not be improved in general. We also extend this result to the p-th moment bound keeping optimal scaling of the remainders with respect to n. Our analysis relies on the properties of the SGD iterates viewed as a time-homogeneous Markov chain. In particular, we establish that this chain is geometrically ergodic with respect to a suitably defined weighted Wasserstein semimetric.
Priority areas: IT and mathematics mathematics
Language: English
DOI
Text on another site
Keywords: цепи МарковаMarkov chainsRichardson-Romberg extrapolationstochastic gradient descentстохастический градиентный спускPolyak-RuppertРичардсон-Ромберг
Similar publications
ML-based Fast Simulation of FARICH Responses
Shipilov F., Barnyakov A., Ivanov A. et al., / Series Physics "arxiv.org". 2026.
A fast simulation of the detector response is a vital task in high-energy physics (HEP). Traditional Monte-Carlo methods form the backbone of modern particle physics simulation software but are computationally expensive. We present a machine-learning-based approach to fast simulation of the Focusing Aerogel Ring Imaging Cherenkov (FARICH) detector response. Given a particle track and momentum, ...
Added: May 19, 2026
Bifurcations and Structural Stability of Generic PC-HC Families
Dorovskiy A., / Series arXiv "math". 2026.
In this paper the structural stability of generic families of vector fields of the PC-HC class on the two-dimensional sphere is proved. A classification of these families up to moderate equivalence in neighborhoods of their large bifurcation supports is presented, based on such invariants as the configuration and the characteristic set. The realization lemma is proved. ...
Added: May 14, 2026
On the minimum number of maximal distance-k independent sets in trees
Taletskii D., / Series arXiv "math". 2026.
A vertex subset of a graph is called a \textit{distance-$k$ independent set} if the distance between any two of its distinct vertices is at least $k + 1$. For all $n,k \geq 1$, we determine the minimum possible number of inclusion-wise maximal distance-$k$ independent sets among all $n$-vertex trees. It equals~$n$ if $n \leq k ...
Added: May 1, 2026
On Arithmetic Mirror Symmetry for smooth Fano fourfolds
Ovcharenko M., / Series arXiv "math". 2026.
We introduce an explicit class of tempered Laurent polynomials in the sense of Villegas and Doran--Kerr in n⩽4 variables including all Landau--Ginzburg models for smooth Fano threefolds with very ample anticanonical class. We check that it contains Landau--Ginzburg models for various Fano fourfolds which are complete intersections in smooth toric varieties and Grassmannians of planes, ...
Added: April 30, 2026
Natural hazard database from Internet publications: text mining with a large language model
Derkacheva A., Sakirkina M., Kraev G. et al., /. 2026.
Comprehensive data on natural hazards and their consequences are crucial for effective for risk assessment, adaptation planning, and emergency response. However, many countries face challenges with fragmented, inconsistent, and inaccessible data, particularly regarding local-scale events. To address this data gap in Russia, we developed an end-to-end processing pipeline that scrapes news from various online sources, ...
Added: April 28, 2026
Algorithmic overlaps as thermodynamic variables: from local to cluster Monte Carlo dynamics in critical phenomena
Pilé I., Deng Y., Shchur L., / Series arXiv "math". 2026. No. 2604.10254.
We investigate the spatial overlap of successive spin configurations in Markov chain Monte Carlo simulations using the local Metropolis algorithm and the Svendsen-Wang and Wolff cluster algorithms. We examine the dynamics of these algorithms for two models in different universality classes: the Ising model and the Potts model with three components. The overlap of two ...
Added: April 20, 2026
On weak solutions to the 1d compressible Navier-Stokes equations: a Lipschitz continuous dependence on data in weaker norms and an error of their homogenization
Zlotnik Alexander, / Series arXiv "math". 2026. No. 2602.03481v1.
We deal with the global in time weak solutions to the 1D compressible Navier-Stokes system of equations for large discontinuous initial data and nonhomogeneous boundary conditions of three standard types. We prove the Lipschitz-type continuous dependence of the solution $(\eta,u,\theta)$, in a norm slightly stronger than $L^{2,\infty}(Q)\times L^2(Q)\times L^2(Q)$,  on the initial data $(\eta^0,u^0,e^0)$ in a ...
Added: April 18, 2026
High-Order Error Bounds for Markovian LSA with Richardson–Romberg Extrapolation
Levin I., Naumov A., Samsonov S., , in: Proceedings of the AAAI Conference on Artificial Intelligence. AAAI-26: AAAI Technical Track on Planning, Routing, and Scheduling; AAAI Technical Track on Reasoning under Uncertainty; AAAI Technical Track on Search and Optimization. Main Track, volume 40 no. 43.: American Association for Artificial Intelligence (AAAI) Press, 2026. P. 36696–36704.
In this paper, we study the bias and high-order error bounds of the Linear Stochastic Approximation (LSA) algorithm with Polyak-Ruppert (PR) averaging under Markovian noise. We focus on the version of the algorithm with constant step size and propose a novel decomposition of the bias via a linearization technique. We analyze the structure of the ...
Added: April 17, 2026
On the dimension of the space of static potentials on three-manifolds
Medvedev V., / Series arXiv "math". 2026.
We investigate the interplay between the dimension of the space of static potentials and the geometric and topological structure of the underlying static three-manifold. A partial classification of boundaryless static manifolds is obtained in terms of this dimension. We also treat the case of static manifolds with boundary. In particular, we prove that if a ...
Added: April 3, 2026
Using predefined vector systems to speed up neural network multimillion class classification
Gabdullin N., Androsov I., / Series Computer Science "arxiv.org". 2026.
Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we show that if NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can ...
Added: April 2, 2026
Homogeneous maximizers of the Blaschke-Santalo-type functionals
Kolesnikov A., / Series arXiv "math". 2025.
We study Blaschke--Santal{ó}-type inequalities for N>=2  sets (functions) and a special class of cost functions. In particular, we prove new results about reduction of the maximization problem for the Blaschke--Santal{ó}-type functional to homogeneous case (functional inequalities on the sphere) and extend the symmetrization argument to the case of  N>2 sets. We also discuss links to the ...
Added: February 13, 2026
Об одном применении теоремы А.Н. Колмогорова
Соболев В. Н., Фролов А. А., Чебышевский сборник 2025 Т. 26 № 5 С. 203–220
In the article, on the class K 0 of infinite binary sequences without the runs of ones, a consistent probability distribution P is constructed which is induced by a time-homogeneous Markov chain with a one-step transition matrix P𝜑 , and is completely determined by the golden ratio 𝜑. Using a Markov chain to construct a probability measure P ...
Added: February 11, 2026
Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection
Sorokin K., Beketov M., Онучин А. et al., / arxiv.org. Серия cs.SI "Social and Information Networks ". 2025.
Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...
Added: January 15, 2026
A Revival of Conservative Ideology and the Projected Religious Landscape in Russia
Skorobogatov A., Economics of Transition and Institutional Change 2026 Vol. 34 No. 2 P. 387–409
This paper analyzes the dynamics of the public attitude towards religion using longitudinal data from Russian respondents. Applying Markov chains and regression analysis, we determine the relative success of religious groups in retaining and attracting members. Based on this information, we estimate and explain the projected religious composition of Russia. According to our results, the ...
Added: November 3, 2025
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Sheshukova M., Belomestny D., Durmus A. et al., , in: Proceedings of the 13th International Conference on Learning Representations (ICLR 2025).: ICLR, 2025.
Added: August 15, 2025
Refined Analysis of Constant Step Size Federated Averaging and Federated Richardson-Romberg Extrapolation
Paul M., Durmus A., Dieuleveut A. et al., , in: Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 3-5 May 2025, Splash Beach Resort in Mai Khao, Thailand, PMLR: vol. 258Vol. 258.: PMLR, 2025. Ch. 258 P. 5023–5031.
In this paper, we present a novel analysis of FedAvg with constant step size, relying on the Markov property of the underlying process. We demonstrate that the global iterates of the algorithm converge to a stationary distribution and analyze its resulting bias and variance relative to the problem’s solution. We provide a first-order bias expansion in ...
Added: May 18, 2025
Rosenthal-type inequalities for linear statistics of Markov chains
Durmus A., Moulines E., Naumov A. et al., / Series arXiv "math". 2023.
In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein-type inequalities for sums of independent random variables. We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain. Our proof technique is, as far as we know, ...
Added: June 18, 2023
Local Limit Theorems and Strong Approximations for Robbins-Monro Procedures
Konakov V., Mammen E., / Series arXiv "math". 2023. No. 2304.10673.
The Robbins-Monro algorithm is a recursive, simulation-based stochastic procedure to approximate the zeros of a function that can be written as an expectation. It is known that under some technical assumptions, Gaussian limit distributions approximate the stochastic performance of the algorithm. Here, we are interested in strong approximations for Robbins-Monro procedures. The main tool for ...
Added: April 24, 2023
Local-Global MCMC kernels: the best of both worlds
Samsonov S., Lagutin E., Gabrie M. et al., , in: Thirty-Sixth Conference on Neural Information Processing Systems : NeurIPS 2022.: Curran Associates, Inc., 2022. P. 5178–5193.
Added: February 1, 2023
BR-SNIS: Bias Reduced Self-Normalized Importance Sampling
Cardoso G., Samsonov S., Thin A. et al., , in: Thirty-Sixth Conference on Neural Information Processing Systems : NeurIPS 2022.: Curran Associates, Inc., 2022. P. 716–729.
Added: February 1, 2023
Mathematical Model for Assessing the Reliability of Water Supply Networks
Runev E. V., Springer Nature Switzerland 2022 Vol. 402 No. 1 P. 343–351
The book presents latest developments in the field of high-speed railway, Hyperloop transportation technologies and Maglev system. In recent years, railway transport has received a powerful impetus in its development. With the advent of the 4th Industrial revolution, the transport sector is moving towards full digitalization. TransSiberia is a platform where both the rail industry ...
Added: November 1, 2022
  • About
  • About
  • Key Figures & Facts
  • Sustainability at HSE University
  • Faculties & Departments
  • International Partnerships
  • Faculty & Staff
  • HSE Buildings
  • HSE University for Persons with Disabilities
  • Public Enquiries
  • Studies
  • Admissions
  • Programme Catalogue
  • Undergraduate
  • Graduate
  • Exchange Programmes
  • Summer University
  • Summer Schools
  • Semester in Moscow
  • Business Internship
  • Research
  • International Laboratories
  • Research Centres
  • Research Projects
  • Monitoring Studies
  • Conferences & Seminars
  • Academic Jobs
  • Yasin (April) International Academic Conference on Economic and Social Development
  • Media & Resources
  • Publications by staff
  • HSE Journals
  • Publishing House
  • iq.hse.ru: commentary by HSE experts
  • Library
  • Economic & Social Data Archive
  • Video
  • HSE Repository of Socio-Economic Information
  • HSE1993–2026
  • Contacts
  • Copyright
  • Privacy Policy
  • Site Map
Edit