
## On Primal and Dual Approaches for Distributed Stochastic Convex Optimization over Networks

P. 7435-7440.

We introduce primal and dual stochastic gradient oracle methods for distributed convex optimization problems over networks. We show that the proposed methods are optimal (in terms of communication steps) for primal and dual oracles. Additionally, for a dual stochastic oracle, we propose a new analysis method for the rate of convergence in terms of the duality gap and the probability of large deviations. This analysis is based on a new technique that allows us to bound the distance between the iteration sequence and the optimal point. By a proper choice of batch size, we can guarantee that this distance equals, up to a constant factor, the distance between the starting point and the solution.

Ivanova A., Pasechnyuk D., Dvurechensky P. et al., / Cornell University. Series "Working papers by Cornell University". 2019.

In this paper, we consider the resource allocation problem in a network with a large number of connections used by a huge number of users. The resource allocation problem we consider is a maximization problem with linear inequality constraints. To solve this problem, we construct the dual problem and propose to use ...
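The dual approach described above can be illustrated on a hypothetical toy instance (logarithmic utilities and a single shared link; not necessarily the utilities or the method analyzed in the paper), where the inner maximization has a closed form and the prices are updated by projected dual gradient ascent:

```python
import numpy as np

def allocate(A, b, n_iters=5000, lr=0.01):
    # Dual decomposition for: maximize sum_k log(x_k) s.t. A x <= b, x > 0.
    # For logarithmic utilities the inner maximization has the closed form
    # x_k(lambda) = 1 / (A^T lambda)_k, so only the prices are iterated.
    m, n = A.shape
    lam = np.ones(m)                                 # dual prices, one per constraint
    for _ in range(n_iters):
        x = 1.0 / (A.T @ lam)                        # users' best response to prices
        lam = np.maximum(lam + lr * (A @ x - b), 1e-8)  # projected dual ascent
    return x, lam
```

On a single link of capacity 1 shared by two identical users, the prices converge so that each user receives half of the capacity.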

Added: October 23, 2020

Puchkin N., Gorbunov E., Kutuzov N. et al., in: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024), 2-4 May 2024, Palau de Congressos, Valencia, Spain. PMLR, 2024. Vol. 238. P. 856-864.

We consider stochastic optimization problems with heavy-tailed noise with structured density. For such problems, we show that it is possible to get faster rates of convergence than O(K^{−2(α−1)/α}) when the stochastic gradients have finite α-th moment, α ∈ (1,2]. In particular, our analysis allows the noise norm to have an unbounded expectation. To achieve these results, we stabilize stochastic gradients, ...
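A standard way to stabilize gradients under heavy-tailed noise is gradient clipping; the sketch below is a generic textbook illustration of that idea, not necessarily the exact estimator used in the paper:

```python
import numpy as np

def clipped_sgd_step(x, grad, lr, clip_level):
    # Gradient clipping: rescale g so that ||g|| <= clip_level,
    # g <- min(1, clip_level / ||g||) * g,
    # bounding the influence of any single heavy-tailed sample on the step.
    norm = np.linalg.norm(grad)
    if norm > clip_level:
        grad = grad * (clip_level / norm)
    return x - lr * grad
```

However large the sampled gradient, the resulting displacement is bounded by `lr * clip_level`.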

Added: April 22, 2024

Dvurechensky P., Dvinskikh D., Gasnikov A. et al., in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). Neural Information Processing Systems Foundation, 2018. P. 10760-10770.

We study the decentralized distributed computation of discrete approximations of the regularized Wasserstein barycenter of a finite set of continuous probability measures stored in a distributed manner over a network. We assume there is a network of agents/machines/computers, and each agent holds a private continuous probability measure and seeks to compute the barycenter of all the measures in ...

Added: October 31, 2020

Bogachev T., / Cornell University. Series "arxiv.org" (math). 2022.

In this research, a continuous model for resource allocation in a queuing system is considered and a local prediction of the system behavior is developed. As a result, we obtain a set of possible cases, some of which lead to quite clear optimization problems. Currently, the main result of this research direction is an algorithm ...

Added: October 21, 2022

Kovalev D., Shulgin E., Richtarik P. et al., PMLR, 2021

We propose ADOM – an accelerated method for smooth and strongly convex decentralized optimization over time-varying networks. ADOM uses a dual oracle, i.e., we assume access to the gradient of the Fenchel conjugate of the individual loss functions. Up to a constant factor, which depends on the network structure only, its communication complexity is the ...
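As a minimal illustration of what a dual oracle provides (an illustrative example, not ADOM itself): for a simple quadratic loss the gradient of the Fenchel conjugate is available in closed form:

```python
import numpy as np

def grad_fenchel_conjugate(y, a):
    # For f(x) = 0.5 * ||x - a||^2 the Fenchel conjugate is
    # f*(y) = 0.5 * ||y||^2 + <y, a>, hence grad f*(y) = y + a,
    # i.e. the maximizer of <y, x> - f(x), which is what a
    # dual oracle is assumed to return.
    return y + a
```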

Added: October 31, 2021

Eduard Gorbunov, Kovalev D., Makarenko D. et al., in: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Curran Associates, Inc., 2020. P. 20889-20900.

Added: December 7, 2020

Vorontsova E., Gasnikov A., Dvurechensky P. et al., Automation and Remote Control 2019 Vol. 80 No. 8 P. 1487-1501

We propose an accelerated gradient-free method with a non-Euclidean proximal operator associated with the p-norm (1 ⩽ p ⩽ 2). We obtain estimates for the rate of convergence of the method under low noise arising in the computation of the function value. We present the results of computational experiments.
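A generic two-point directional estimator of the kind such gradient-free methods build on can be sketched as follows (an illustrative textbook form, not the paper's exact smoothing scheme):

```python
import numpy as np

def two_point_grad_estimate(f, x, tau, e):
    # Central finite difference along a unit direction e:
    # estimates d * <grad f(x), e> * e from two function values,
    # the only information a gradient-free (zeroth-order) oracle provides.
    d = x.size
    return d * (f(x + tau * e) - f(x - tau * e)) / (2.0 * tau) * e
```

For a quadratic the central difference is exact, so the estimate equals `d * <grad f(x), e> * e` up to floating-point error.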

Added: December 10, 2019

Dvurechensky P., Ostroukhov P., Safin K. et al., in: International Conference on Machine Learning (ICML 2020). Vol. 119. PMLR, 2020.

Projection-free optimization via different variants of the Frank-Wolfe (FW) method, a.k.a. the Conditional Gradient method, has become one of the cornerstones of optimization for machine learning, since in many cases the linear minimization oracle is much cheaper to implement than projections, and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems ...
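The linear minimization oracle is especially cheap on the probability simplex, where it always returns a vertex. A minimal sketch of the classical FW iteration (illustrative only; not the variants studied in the paper):

```python
import numpy as np

def frank_wolfe_simplex(grad_f, x0, n_iters=200):
    # Classical Frank-Wolfe over the probability simplex: the LMO
    # argmin_{s in simplex} <g, s> is the vertex e_i with i = argmin_i g_i,
    # so no projection is ever needed and iterates stay feasible.
    x = x0.copy()
    for k in range(n_iters):
        g = grad_f(x)
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0             # LMO: best vertex
        gamma = 2.0 / (k + 2)             # standard step-size schedule
        x = (1 - gamma) * x + gamma * s   # convex combination stays in simplex
    return x
```

For example, minimizing `||x - c||^2` over the simplex with `c` inside it drives the iterates toward `c` at the usual O(1/k) rate.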

Added: October 31, 2020

Ivanova A., Dvurechensky P., Vorontsova E. et al., Journal of Optimization Theory and Applications 2022 Vol. 193 No. 1-3 P. 462-490

Many convex optimization problems have structured objective functions written as a sum of functions with different oracle types (e.g., full gradient, coordinate derivative, stochastic gradient) and different arithmetic operations complexity of these oracles. In the strongly convex case, these functions also have different condition numbers that eventually define the iteration complexity of first-order methods and ...

Added: October 28, 2022

Osokin A., Alayrac J., Lukasewitz I. et al., in: Proceedings of Machine Learning Research: Proceedings of the International Conference on Machine Learning (ICML 2016). Vol. 48. NY: [s.n.], 2016. P. 885-925.

In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by ...
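The gap-based intuition can be sketched as adaptive block sampling (a simplified illustration; the paper's actual sampling and caching schemes are more refined):

```python
import numpy as np

def sample_block(gap_estimates, rng):
    # Adaptive block selection: sample a block with probability
    # proportional to its (nonnegative) estimated Frank-Wolfe gap,
    # concentrating computation where the most progress is available.
    p = np.maximum(np.asarray(gap_estimates, dtype=float), 0.0)
    p /= p.sum()
    return rng.choice(len(p), p=p)
```

Blocks whose estimated gap is zero are never selected, and a block holding all of the remaining gap is selected every time.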

Added: October 19, 2017

Protasov V., Jungers R., Linear Algebra and its Applications 2013 Vol. 438 No. 11 P. 4448-4468

We introduce a new approach to evaluate the largest Lyapunov exponent of a family of nonnegative matrices. The method is based on special positive homogeneous functionals, which give iterative lower and upper bounds for the Lyapunov exponent. They improve previously known bounds and converge to the true value. The rate of convergence ...

Added: February 23, 2016

Yakushev V. L., Simbirkin V. N., Filimonov A. V. et al., Vestnik of Lobachevsky University of Nizhni Novgorod 2012 No. 4(1) P. 238-246

The results of numerical experiments using iterative methods are presented for the solution of ill-conditioned symmetric SLAEs arising in a number of structural mechanics problems. Second-order incomplete triangular factorization has been used to construct preconditioners. MPI and TBB technologies have been exploited to parallelize computations. Numerical experiments in different parallel regimes have been ...
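The overall scheme, a preconditioned Krylov iteration, can be sketched as follows; for brevity a simple Jacobi (diagonal) preconditioner stands in for the second-order incomplete triangular factorization used in the paper:

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-10, max_iter=500):
    # Preconditioned conjugate gradients for an SPD matrix A.
    # M_inv_diag holds the inverse diagonal of the (Jacobi) preconditioner,
    # applied as z = M^{-1} r each iteration.
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv_diag * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

Swapping in a stronger preconditioner only changes how `z` is computed from `r`; the Krylov iteration itself is unchanged.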

Added: February 17, 2017

Kovalev D., Eduard Gorbunov, Gasanov E. et al., in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). Neural Information Processing Systems Foundation, 2018. P. 3358-3367.

The state-of-the-art methods for solving optimization problems in big dimensions are variants of randomized coordinate descent (RCD). In this paper we introduce a fundamentally new type of acceleration strategy for RCD based on the augmentation of the set of coordinate directions by a few spectral or conjugate directions. As we increase the number of extra ...

Added: December 7, 2020

Gasnikov A., Dvurechensky P., Gorbunov E. et al., in: Conference on Learning Theory, 25-28 June 2019, Phoenix, USA. Vol. 99. [s.n.], 2019. P. 1374-1391.

We consider convex optimization problems with the objective function having Lipschitz-continuous p-th order derivative, where p ≥ 1. We propose a new tensor method, which closes the gap between the lower Ω(ε^{−2/(3p+1)}) and upper O(ε^{−1/(p+1)}) iteration complexity bounds for this class of optimization problems. We also consider uniformly convex functions, and show how the proposed ...
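For reference, the bounds mentioned in the abstract can be written out; "closing the gap" means attaining the lower bound up to logarithmic factors (hidden by the tilde):

```latex
\Omega\!\left(\varepsilon^{-\frac{2}{3p+1}}\right)
\;\le\; \#\text{iterations} \;\le\;
O\!\left(\varepsilon^{-\frac{1}{p+1}}\right),
\qquad
\text{proposed method: } \tilde{O}\!\left(\varepsilon^{-\frac{2}{3p+1}}\right).
```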

Added: October 31, 2020

Garces A., Azhmyakov V., IFAC-PapersOnLine 2020 Vol. 53 No. 2 P. 13173-13177

This paper deals with an application of nested convex programming to the optimal power flow (OPF) problem in multi-terminal high-voltage direct-current grids (MT-HVDC). The real-world optimization problem under consideration is non-convex. This fact implies possible inconsistencies of conventional numerical minimization algorithms (such as the interior point method). Moreover, the constructive numerical treatment of this ...

Added: October 30, 2021

Dvinskikh D., Tiapkin D., in: Proceedings of Machine Learning Research, Volume 130: International Conference on Artificial Intelligence and Statistics. [s.n.], 2021. P. 1738-1746.

In this paper, we focus on computational aspects of the Wasserstein barycenter problem. We propose two algorithms to compute Wasserstein barycenters of discrete measures. The first algorithm, based on mirror prox with a specific norm, matches the complexity of the celebrated accelerated iterative Bregman projections (IBP), however without the limitations of the (accelerated) IBP, which is ...
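The (non-accelerated) IBP baseline that the paper is compared against can be sketched for discrete measures on a common grid (a standard textbook form of Benamou et al.'s scheme, not the mirror-prox algorithm proposed here):

```python
import numpy as np

def ibp_barycenter(P, C, gamma=0.05, n_iters=200, weights=None):
    # Iterative Bregman Projections for the entropic Wasserstein barycenter
    # of the columns of P (probability vectors on a common grid),
    # with ground-cost matrix C and entropic regularization gamma.
    n, m = P.shape
    w = np.full(m, 1.0 / m) if weights is None else weights
    K = np.exp(-C / gamma)                        # Gibbs kernel
    u = np.ones((n, m))
    for _ in range(n_iters):
        v = P / (K.T @ u)                         # match the input marginals
        Kv = K @ v
        b = np.exp((w * np.log(Kv)).sum(axis=1))  # weighted geometric mean
        u = b[:, None] / Kv                       # match the barycenter marginal
    return b / b.sum()
```

The returned barycenter is itself a probability vector on the same grid; smaller `gamma` reduces the entropic blur at the cost of slower, less stable iterations.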

Added: November 2, 2022

Kamzolov D., Dvurechensky P., Gasnikov A., Optimization Methods and Software 2021 Vol. 36 No. 6 P. 1289-1316

In this paper, we propose new first-order methods for minimization of a convex function on a simple convex set. We assume that the objective function is a composite function given as a sum of a simple convex function and a convex function with an inexact Hölder-continuous subgradient. We propose the Universal Intermediate Gradient Method. Our method enjoys ...

Added: August 4, 2020

Ingacheva A., Kokhan V., Osipov D., in: Proceedings of the 32nd European Conference on Modelling and Simulation (ECMS 2018), Wilhelmshaven, Germany, 22-25 May 2018. NY: Curran Associates, Inc., 2018. P. 183-189.

In this paper we consider the task of mapping the inner objects of a building with a set of autonomous agents moving around it, which use a narrow beam of radio waves at the WiFi frequency (2.4 GHz). A linear model of pixel-wise radio wave attenuation is considered. The SIRT algorithm with TV and Tikhonov regularizations is used for the ...

Added: April 6, 2019

Kornilov N., Shamir O., Lobanov A. et al., in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023.

In this paper, we consider non-smooth stochastic convex optimization with two function evaluations per round under infinite noise variance. In the classical setting, when the noise has finite variance, an optimal algorithm, built upon the batched accelerated gradient method, was proposed in (Gasnikov et al., 2022). This optimality is defined in terms of iteration and oracle ...

Added: March 26, 2024

Moulines E., Pereyra M., Durmus A., SIAM Journal on Imaging Sciences 2018 Vol. 11 No. 1 P. 473-506

Modern imaging methods rely strongly on Bayesian inference techniques to solve challenging imaging problems. Currently, the predominant Bayesian computation approach is convex optimization, which scales very efficiently to high-dimensional image models and delivers accurate point estimation results. However, in order to perform more complex analyses, for example, image uncertainty quantification or model selection, it is ...

Added: December 11, 2018

Gourary M. M., Zharov M. M., Rusakov S. G. et al., Informatsionnye Tekhnologii (Information Technologies) 2018 Vol. 24 No. 7 P. 435-444

The directions of improvement of minimax methods for circuit design problems are considered. The choice of a generalized quality criterion for circuit design is discussed. It is concluded that the minimax criterion has advantages over other formulations of design targets. A new approach to setting individual objectives for each performance indicator is proposed. The approach ...

Added: February 12, 2019

Zlotnik A., Čiegis R., Journal of Scientific Computing 2023 Vol. 95 No. 1 Article 3

We consider an initial-boundary value problem for the $n$-dimensional wave equation with the variable sound speed, $n\geq 1$. We construct three-level implicit in time and compact in space (three-point in each space direction) 4th order finite-difference schemes on the uniform rectangular meshes including their one-parameter (for $n=2$) and three-parameter (for $n=3$) families. We also show that some ...

Added: January 20, 2023

Ivanova A., Gasnikov A., Dvurechensky P. et al., / Working papers by Cornell University. Series "Optimization and Control". 2020.

Regularized empirical risk minimization problems, ubiquitous in machine learning, are often composed of several blocks which can be treated using different types of oracles, e.g., full gradient, stochastic gradient or coordinate derivative. Optimal oracle complexity is known and achievable separately for the full gradient case, the stochastic gradient case, etc. We propose a generic framework ...

Added: October 25, 2020

Eduard Gorbunov, Bibi A., Sener O. et al., in: Proceedings of the 8th International Conference on Learning Representations (ICLR 2020). ICLR, 2020. P. 1-28.

We consider the problem of unconstrained minimization of a smooth objective function in $\mathbb{R}^d$ in the setting where only function evaluations are possible. We propose and analyze a stochastic zeroth-order method with heavy-ball momentum. In particular, we propose SMTP, a momentum version of the stochastic three-point method (STP) of Bergou et al. (2019). We show new complexity ...
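The underlying STP scheme is simple to sketch (illustrative only: SMTP adds a heavy-ball momentum term on top of it, and the paper's step-size rules differ from the decaying schedule assumed here):

```python
import numpy as np

def stp_minimize(f, x0, n_iters=300, rng=None):
    # Stochastic three-point method (STP): sample a random unit direction s
    # and keep the best of {x, x + t*s, x - t*s}. Only function values are
    # needed, and the objective never increases by construction.
    if rng is None:
        rng = np.random.default_rng(0)
    x = x0.copy()
    for k in range(n_iters):
        t = 1.0 / (k + 1)                 # illustrative decaying step size
        s = rng.standard_normal(x.size)
        s /= np.linalg.norm(s)
        x = min((x, x + t * s, x - t * s), key=f)
    return x
```

Because each step keeps the best of three candidates, the method is monotone in `f` regardless of the sampled directions.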

Added: December 7, 2020