• A
  • A
  • A
  • АБВ
  • АБВ
  • АБВ
  • A
  • A
  • A
  • A
  • A
Обычная версия сайта
  • RU
  • EN
  • HSE University
  • Publications
  • Preprints
  • Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation
  • RU
  • EN
Расширенный поиск
Высшая школа экономики
Национальный исследовательский университет
Priority areas
  • business informatics
  • economics
  • engineering science
  • humanitarian
  • IT and mathematics
  • law
  • management
  • mathematics
  • sociology
  • state and public administration
by year
  • 2027
  • 2026
  • 2025
  • 2024
  • 2023
  • 2022
  • 2021
  • 2020
  • 2019
  • 2018
  • 2017
  • 2016
  • 2015
  • 2014
  • 2013
  • 2012
  • 2011
  • 2010
  • 2009
  • 2008
  • 2007
  • 2006
  • 2005
  • 2004
  • 2003
  • 2002
  • 2001
  • 2000
  • 1999
  • 1998
  • 1997
  • 1996
  • 1995
  • 1994
  • 1993
  • 1992
  • 1991
  • 1990
  • 1989
  • 1988
  • 1987
  • 1986
  • 1985
  • 1984
  • 1983
  • 1982
  • 1981
  • 1980
  • 1979
  • 1978
  • 1977
  • 1976
  • 1975
  • 1974
  • 1973
  • 1972
  • 1971
  • 1970
  • 1969
  • 1968
  • 1967
  • 1966
  • 1965
  • 1964
  • 1963
  • 1958
  • More
Subject
News
April 30, 2026
HSE Researchers Compile Scientific Database for Studying Childrens Eating Habits
The database created at HSE University can serve as a foundation for studying children’s eating habits. This is outlined in the study ‘The Influence of Age, Gender, and Social-Role Factors on Children’s Compliance with Age-Based Nutritional Norms: An Experimental Study Using the Dish-I-Wish Web Application.’ The work has been carried out as part of the HSE Basic Research Programme and was presented at the XXVI April International Academic Conference named after Evgeny Yasin.
April 30, 2026
New Foresight Centre Study Identifies the Most Destructive Global Trends for Humankind
A team of researchers from the HSE International Research and Educational Foresight Centre has examined how global trends affect the quality of human life—from life expectancy to professional fulfilment. The findings of the study titled ‘Human Capital Transformation under the Influence of Global Trends’ were published in Foresight.
April 28, 2026
Scientists Develop Algorithm for Accurate Financial Time Series Forecasting
Researchers at the HSE Faculty of Computer Science benchmarked more than 200,000 model configurations for predicting financial asset prices and realised volatility, showing that performance can be improved by filtering out noise at specific frequencies in advance. This technique increased accuracy in 65% of cases. The authors also developed their own algorithm, which achieves accuracy comparable to that of the best models while requiring less computational power. The study has been published in Applied Soft Computing.

 

Have you spotted a typo?
Highlight it, click Ctrl+Enter and send us a message. Thank you for your help!

Publications
  • Books
  • Articles
  • Chapters of books
  • Working papers
  • Report a publication
  • Research at HSE

?

Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation

2024.
Sheshukova M., Belomestny D., Durmus A., Moulines E., Naumov A., Samsonov S.
We address the problem of solving strongly convex and smooth minimization problems using stochastic gradient descent (SGD) algorithm with a constant step size. Previous works suggested to combine the Polyak-Ruppert averaging procedure with the Richardson-Romberg extrapolation technique to reduce the asymptotic bias of SGD at the expense of a mild increase of the variance. We significantly extend previous results by providing an expansion of the mean-squared error of the resulting estimator with respect to the number of iterations n. More precisely, we show that the mean-squared error can be decomposed into the sum of two terms: a leading one of order $n^{-1/2}$ with explicit dependence on a minimax-optimal asymptotic covariance matrix, and a second-order term of order $n^{-3/4}$ where the power 3/4 can not be improved in general. We also extend this result to the p-th moment bound keeping optimal scaling of the remainders with respect to n. Our analysis relies on the properties of the SGD iterates viewed as a time-homogeneous Markov chain. In particular, we establish that this chain is geometrically ergodic with respect to a suitably defined weighted Wasserstein semimetric.
Priority areas: IT and mathematics mathematics
Language: English
DOI
Text on another site
Keywords: цепи МарковаMarkov chainsRichardson-Romberg extrapolationstochastic gradient descentстохастический градиентный спускPolyak-RuppertРичардсон-Ромберг
Similar publications
Natural hazard database from Internet publications: text mining with a large language model
Derkacheva A., Sakirkina M., Kraev G. et al., /. 2026.
Comprehensive data on natural hazards and their consequences are crucial for effective for risk assessment, adaptation planning, and emergency response. However, many countries face challenges with fragmented, inconsistent, and inaccessible data, particularly regarding local-scale events. To address this data gap in Russia, we developed an end-to-end processing pipeline that scrapes news from various online sources, ...
Added: April 28, 2026
Ising models on the hydrogen peroxide and other lattices
Qin X., Deng Y., Shchur L. et al., / Series arXiv "math". 2026. No. 2603.02962.
We perform a Monte Carlo analysis of the Ising model on many three-dimensional lattices. By means of finite-size scaling we obtain the critical points and determine the scaling dimensions. As expected, the critical exponents agree with the three-dimensional Ising universality class for all models. The irrelevant field, as revealed by the correction-to-scaling amplitudes, appears to ...
Added: April 20, 2026
Algorithmic overlaps as thermodynamic variables: from local to cluster Monte Carlo dynamics in critical phenomena
Pilé I., Deng Y., Shchur L., / Series arXiv "math". 2026. No. 2604.10254.
We investigate the spatial overlap of successive spin configurations in Markov chain Monte Carlo simulations using the local Metropolis algorithm and the Svendsen-Wang and Wolff cluster algorithms. We examine the dynamics of these algorithms for two models in different universality classes: the Ising model and the Potts model with three components. The overlap of two ...
Added: April 20, 2026
On weak solutions to the 1d compressible Navier-Stokes equations: a Lipschitz continuous dependence on data in weaker norms and an error of their homogenization
Zlotnik Alexander, / Series arXiv "math". 2026. No. 2602.03481v1.
We deal with the global in time weak solutions to the 1D compressible Navier-Stokes system of equations for large discontinuous initial data and nonhomogeneous boundary conditions of three standard types. We prove the Lipschitz-type continuous dependence of the solution $(\eta,u,\theta)$, in a norm slightly stronger than $L^{2,\infty}(Q)\times L^2(Q)\times L^2(Q)$,  on the initial data $(\eta^0,u^0,e^0)$ in a ...
Added: April 18, 2026
High-Order Error Bounds for Markovian LSA with Richardson–Romberg Extrapolation
Levin I., Naumov A., Samsonov S., , in: AAAI '26: Proceedings of the 40th Annual AAAI Conference on Artificial Intelligence.: American Association for Artificial Intelligence (AAAI) Press, 2026. P. 36696–36704.
In this paper, we study the bias and high-order error bounds of the Linear Stochastic Approximation (LSA) algorithm with Polyak-Ruppert (PR) averaging under Markovian noise. We focus on the version of the algorithm with constant step size and propose a novel decomposition of the bias via a linearization technique. We analyze the structure of the ...
Added: April 17, 2026
On the dimension of the space of static potentials on three-manifolds
Medvedev V., / Series arXiv "math". 2026.
We investigate the interplay between the dimension of the space of static potentials and the geometric and topological structure of the underlying static three-manifold. A partial classification of boundaryless static manifolds is obtained in terms of this dimension. We also treat the case of static manifolds with boundary. In particular, we prove that if a ...
Added: April 3, 2026
Using predefined vector systems to speed up neural network multimillion class classification
Gabdullin N., Androsov I., / Series Computer Science "arxiv.org". 2026.
Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we show that if NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can ...
Added: April 2, 2026
Homogeneous maximizers of the Blaschke-Santalo-type functionals
Kolesnikov A., / Series arXiv "math". 2025.
We study Blaschke--Santal{ó}-type inequalities for N>=2  sets (functions) and a special class of cost functions. In particular, we prove new results about reduction of the maximization problem for the Blaschke--Santal{ó}-type functional to homogeneous case (functional inequalities on the sphere) and extend the symmetrization argument to the case of  N>2 sets. We also discuss links to the ...
Added: February 13, 2026
Об одном применении теоремы А.Н. Колмогорова
Соболев В. Н., Фролов А. А., Чебышевский сборник 2025 Т. 26 № 5 С. 203–220
In the article, on the class K 0 of infinite binary sequences without the runs of ones, a consistent probability distribution P is constructed which is induced by a time-homogeneous Markov chain with a one-step transition matrix P𝜑 , and is completely determined by the golden ratio 𝜑. Using a Markov chain to construct a probability measure P ...
Added: February 11, 2026
Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection
Sorokin K., Beketov M., Онучин А. et al., / arxiv.org. Серия cs.SI "Social and Information Networks ". 2025.
Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...
Added: January 15, 2026
On finding formal power-logarithmic expansions of solutions to q-difference equations
Gaianov N., Parusnikova A., / Cornell University. Серия math "arxiv.org". 2025.
An algebraic q-difference equation is considered. A sufficient condition for the existence of a formal power-logarithmic expansion of a solution to such an equation in the neighborhood of zero is proposed. An example of applying this sufficient condition for constructing a formal expansion of a solution to a certain q-difference analogue of the fifth Painlevé equation ...
Added: December 25, 2025
Implementing Transport Coding in OMNeT++ for Message Delay Reduction
Petrovanov I., Sergeev A., / Series Computer Science "arxiv.org". 2025. No. 2512.18332.
Transport coding reduces message delay in packet-switched networks by introducing controlled redundancy at the transport layer:  original packets are encoded into  coded packets, and the message is reconstructed after the first  successful deliveries, effectively shifting latency from the maximum packet delay to the -th order statistic. We present a concise, reproducible discrete-event implementation of transport coding in OMNeT++, including ...
Added: December 24, 2025
Ideal of the variety of flexes of plane cubics
Popov V., / Series arXiv "math". 2025. No. 2502.01539.
We prove that the variety of flexes of algebraic curves of degree 3 in the projective plane is an ideal theoretic complete intersection in the product of a two-dimensional and a nine-dimensional projective spaces. ...
Added: December 16, 2025
Random walks on rank one symmetric spaces of noncompact type
Gnetov F., Konakov V., / Series arXiv "math". 2025. No. 2512.04667.
We establish a central limit theorem, a local limit theorem, and a law of large numbers for a natural random walk on a symmetric space M of non-compact type and rank one. This class of spaces, which includes the complex and quaternionic hyperbolic spaces and the Cayley hyperbolic plane, generalizes the real hyperbolic space Hn. Our approach introduces ...
Added: December 5, 2025
Cascades of Lorenz attractors in the Shimizu-Morioka model
Kazakov A., Koryakin V., Safonov K. et al., / Series arXiv "math". 2025.
The Lorenz attractor is the first example of a robustly chaotic non-hyperbolic attractor. Each orbit of such an attractor has a positive top Lyapunov exponent, and this property persists under small perturbations despite possible bifurcations of the attractor. In this paper, we study the boundary of the Lorenz attractor existence region in the Shimizu-Morioka model. ...
Added: December 4, 2025
Асимптотический вариант метода параметрикс для цепей Маркова, сходящихся к диффузиям
Bitter I., Konakov V., / Cornell University. Серия arXiv "math". 2025. № 2505.24548.
В работе приводится обобщение локальной предельной теоремы о сходимости неоднородных цепей Маркова к диффузионному пределу на случай, когда соответ- ствующие коэффициенты процессов удовлетворяют слабым условиям регулярности и совпадают лишь асимптотически. В частности, рассматриваемые нами коэффици- енты сноса могут быть неограниченными с не более чем линейным ростом, а оценки отражают перенос терминального состояния неограниченным трендом через ...
Added: December 3, 2025
A Revival of Conservative Ideology and the Projected Religious Landscape in Russia
Skorobogatov A., Economics of Transition and Institutional Change 2026 Vol. 34 No. 2 P. 387–409
This paper analyzes the dynamics of the public attitude towards religion using longitudinal data from Russian respondents. Applying Markov chains and regression analysis, we determine the relative success of religious groups in retaining and attracting members. Based on this information, we estimate and explain the projected religious composition of Russia. According to our results, the ...
Added: November 3, 2025
Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation
Sheshukova M., Belomestny D., Durmus A. et al., , in: Proceedings of the 13th International Conference on Learning Representations (ICLR 2025).: ICLR, 2025.
Added: August 15, 2025
Refined Analysis of Constant Step Size Federated Averaging and Federated Richardson-Romberg Extrapolation
Paul M., Durmus A., Dieuleveut A. et al., , in: Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 3-5 May 2025, Splash Beach Resort in Mai Khao, Thailand, PMLR: vol. 258Vol. 258.: PMLR, 2025. Ch. 258 P. 5023–5031.
In this paper, we present a novel analysis of FedAvg with constant step size, relying on the Markov property of the underlying process. We demonstrate that the global iterates of the algorithm converge to a stationary distribution and analyze its resulting bias and variance relative to the problem’s solution. We provide a first-order bias expansion in ...
Added: May 18, 2025
Rosenthal-type inequalities for linear statistics of Markov chains
Durmus A., Moulines E., Naumov A. et al., / Series arXiv "math". 2023.
In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein-type inequalities for sums of independent random variables. We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain. Our proof technique is, as far as we know, ...
Added: June 18, 2023
Local Limit Theorems and Strong Approximations for Robbins-Monro Procedures
Konakov V., Mammen E., / Series arXiv "math". 2023. No. 2304.10673.
The Robbins-Monro algorithm is a recursive, simulation-based stochastic procedure to approximate the zeros of a function that can be written as an expectation. It is known that under some technical assumptions, Gaussian limit distributions approximate the stochastic performance of the algorithm. Here, we are interested in strong approximations for Robbins-Monro procedures. The main tool for ...
Added: April 24, 2023
Local-Global MCMC kernels: the best of both worlds
Samsonov S., Lagutin E., Gabrie M. et al., , in: Thirty-Sixth Conference on Neural Information Processing Systems : NeurIPS 2022.: Curran Associates, Inc., 2022. P. 5178–5193.
Added: February 1, 2023
BR-SNIS: Bias Reduced Self-Normalized Importance Sampling
Cardoso G., Samsonov S., Thin A. et al., , in: Thirty-Sixth Conference on Neural Information Processing Systems : NeurIPS 2022.: Curran Associates, Inc., 2022. P. 716–729.
Added: February 1, 2023
Mathematical Model for Assessing the Reliability of Water Supply Networks
Runev E. V., Springer Nature Switzerland 2022 Vol. 402 No. 1 P. 343–351
The book presents latest developments in the field of high-speed railway, Hyperloop transportation technologies and Maglev system. In recent years, railway transport has received a powerful impetus in its development. With the advent of the 4th Industrial revolution, the transport sector is moving towards full digitalization. TransSiberia is a platform where both the rail industry ...
Added: November 1, 2022
  • About
  • About
  • Key Figures & Facts
  • Sustainability at HSE University
  • Faculties & Departments
  • International Partnerships
  • Faculty & Staff
  • HSE Buildings
  • HSE University for Persons with Disabilities
  • Public Enquiries
  • Studies
  • Admissions
  • Programme Catalogue
  • Undergraduate
  • Graduate
  • Exchange Programmes
  • Summer University
  • Summer Schools
  • Semester in Moscow
  • Business Internship
  • Research
  • International Laboratories
  • Research Centres
  • Research Projects
  • Monitoring Studies
  • Conferences & Seminars
  • Academic Jobs
  • Yasin (April) International Academic Conference on Economic and Social Development
  • Media & Resources
  • Publications by staff
  • HSE Journals
  • Publishing House
  • iq.hse.ru: commentary by HSE experts
  • Library
  • Economic & Social Data Archive
  • Video
  • HSE Repository of Socio-Economic Information
  • HSE1993–2026
  • Contacts
  • Copyright
  • Privacy Policy
  • Site Map
Edit