On a Combination of Alternating Minimization and Nesterov’s Momentum
In book: Vol. 139. PMLR, 2021. P. 3886-3898.
Language: English
Tupitsa N., Dvurechensky P., Gasnikov A. et al., Journal of Inverse and Ill-Posed Problems 2021 Vol. 29 No. 5 P. 721-739
We consider alternating minimization procedures for convex and non-convex optimization problems with the vector of variables divided into several blocks, each block being amenable to minimization with respect to its variables while the other blocks are kept fixed. In the case of two blocks, we prove a linear convergence rate for the alternating minimization procedure under the ...
Added: September 29, 2021
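The two-block scheme from the abstract above can be sketched in a few lines: minimize exactly over one block with the other held fixed, then swap. Below is a toy least-squares illustration; the matrices A, B, vector b, and the iteration count are illustrative assumptions, not an instance from the paper.

```python
import numpy as np

# Two-block alternating minimization on the toy objective
# f(x, y) = ||A x + B y - b||^2: each block update is an exact
# least-squares solve, so the objective is monotonically non-increasing.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
B = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def f(x, y):
    r = A @ x + B @ y - b
    return float(r @ r)

x = np.zeros(5)
y = np.zeros(5)
vals = [f(x, y)]
for _ in range(50):
    x, *_ = np.linalg.lstsq(A, b - B @ y, rcond=None)  # minimize over x, y fixed
    y, *_ = np.linalg.lstsq(B, b - A @ x, rcond=None)  # minimize over y, x fixed
    vals.append(f(x, y))
```

Because each block is minimized exactly, the recorded objective values never increase, which is the monotonicity property the convergence analysis builds on.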
Tupitsa N., Gasnikov A., Dvurechensky P. et al., in: Mathematical Optimization Theory and Operations Research. MOTOR 2020. Communications in Computer and Information Science. Vol. 1275. Springer, 2020. P. 192-204.
In this paper we experimentally check the hypothesis that the dual problem to the discrete entropy-regularized optimal transport problem possesses strong convexity on a certain compact set. We present a numerical technique for estimating the strong convexity parameter and show that such an estimate increases the performance of an accelerated alternating minimization algorithm for strongly convex ...
Added: October 28, 2020
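The dual of the entropy-regularized optimal transport problem mentioned above is a setting where alternating minimization has closed-form block updates: alternating exact minimization over the two dual scaling vectors gives the classical Sinkhorn iteration. A minimal sketch, where the cost matrix, marginals, regularization value, and iteration count are illustrative assumptions:

```python
import numpy as np

# Sinkhorn iterations viewed as alternating minimization over the two
# dual blocks (scaling vectors u, v) of entropy-regularized optimal
# transport; each block update is available in closed form.
rng = np.random.default_rng(1)
n = 8
C = rng.random((n, n))           # toy cost matrix
r = np.full(n, 1.0 / n)          # source marginal
c = np.full(n, 1.0 / n)          # target marginal
gamma = 0.5                      # entropic regularization parameter
K = np.exp(-C / gamma)

u = np.ones(n)
v = np.ones(n)
for _ in range(1000):
    u = r / (K @ v)              # exact minimization over the first block
    v = c / (K.T @ u)            # exact minimization over the second block

P = u[:, None] * K * v[None, :]  # resulting transport plan
```

After convergence the plan P matches both prescribed marginals, which is how the dual alternating scheme certifies feasibility.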
Beznosikov A., Richtarik P., Diskin M. et al., in: Thirty-Sixth Conference on Neural Information Processing Systems: NeurIPS 2022. Curran Associates, Inc., 2022. P. 14013-14029.
Added: January 27, 2023
Ivanova A., Pasechnyuk D., Dvurechensky P. et al. / Cornell University. Series "Working papers by Cornell University". 2019.
In this paper, we consider the resource allocation problem in a network with a large number of connections used by a huge number of users. The resource allocation problem we consider is a maximization problem with linear inequality constraints. To solve this problem we construct the dual problem and propose to use ...
Added: October 23, 2020
Kovalev D., Gorbunov E., Gasanov E. et al., in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). Neural Information Processing Systems Foundation, 2018. P. 3358-3367.
The state-of-the-art methods for solving optimization problems in high dimensions are variants of randomized coordinate descent (RCD). In this paper we introduce a fundamentally new type of acceleration strategy for RCD based on augmenting the set of coordinate directions with a few spectral or conjugate directions. As we increase the number of extra ...
Added: December 7, 2020
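For contrast with the acceleration strategy described above, here is the baseline randomized coordinate descent step on a strongly convex quadratic; the paper's contribution is to augment these coordinate directions with a few spectral or conjugate directions, which this sketch deliberately does not do. The toy matrix and iteration budget are assumptions.

```python
import numpy as np

# Baseline randomized coordinate descent (RCD) on the quadratic
# f(x) = 0.5 x^T M x - b^T x; each step exactly minimizes f along one
# randomly chosen coordinate direction.
rng = np.random.default_rng(2)
n = 30
Q = rng.standard_normal((n, n))
M = Q @ Q.T + n * np.eye(n)      # symmetric positive definite
b = rng.standard_normal(n)
x_star = np.linalg.solve(M, b)   # exact minimizer, for reference

x = np.zeros(n)
for _ in range(10000):
    i = rng.integers(n)
    g_i = M[i] @ x - b[i]        # i-th partial derivative
    x[i] -= g_i / M[i, i]        # exact minimization along coordinate i
```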
Gorbunov E., Bibi A., Sener O. et al., in: Proceedings of the 8th International Conference on Learning Representations (ICLR 2020). ICLR, 2020. P. 1-28.
We consider the problem of unconstrained minimization of a smooth objective function in $\mathbb{R}^d$ in the setting where only function evaluations are possible. We propose and analyze a stochastic zeroth-order method with heavy-ball momentum. In particular, we propose SMTP, a momentum version of the stochastic three-point method (STP) of Bergou et al. (2019). We show new complexity ...
Added: December 7, 2020
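The three-point idea behind STP is easy to state with function values only: draw a random direction s and keep the best of x, x + αs, x − αs. The sketch below adds a heavy-ball-style buffer as an illustration of the momentum idea; it is not the exact SMTP update from the paper, and the objective and parameter values are assumptions.

```python
import numpy as np

# Zeroth-order three-point step with a heavy-ball-style momentum buffer:
# draw a random direction, mix it into the momentum vector, and keep the
# best of three candidate points using function values only. Since the
# current point is always a candidate, the objective never increases.
rng = np.random.default_rng(3)

def f(x):
    return float(x @ x)          # toy smooth objective (assumption)

n = 10
x = np.ones(n)                   # f(x) = 10 at the start
v = np.zeros(n)                  # momentum buffer
alpha, beta = 0.1, 0.5           # step size and momentum (assumptions)
for _ in range(3000):
    s = rng.standard_normal(n)
    s /= np.linalg.norm(s)
    v = beta * v + s             # heavy-ball-style direction
    candidates = [x, x + alpha * v, x - alpha * v]
    x = min(candidates, key=f)   # monotone by construction
```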
Dvurechensky P., Ostroukhov P., Safin K. et al., in: International Conference on Machine Learning (ICML 2020). Vol. 119. PMLR, 2020.
Projection-free optimization via different variants of the Frank-Wolfe (FW) method, a.k.a. the Conditional Gradient method, has become one of the cornerstones of optimization for machine learning, since in many cases the linear minimization oracle is much cheaper to implement than projections and some sparsity needs to be preserved. In a number of applications, e.g. Poisson inverse problems ...
Added: October 31, 2020
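The cheapness of the linear minimization oracle mentioned above is easy to see on the probability simplex: the oracle returns a vertex, namely the coordinate where the gradient is smallest, with no projection required. A minimal Frank-Wolfe sketch with the standard 2/(k+2) step size; the quadratic objective and its target point are illustrative stand-ins, not from the paper.

```python
import numpy as np

# Minimal Frank-Wolfe (conditional gradient) on the probability simplex:
# the linear minimization oracle is a single argmin over the gradient
# (a vertex pick), and iterates stay feasible by convex combination.
rng = np.random.default_rng(4)
n = 20
target = rng.random(n)
target /= target.sum()           # an interior point of the simplex

def grad(x):
    return x - target            # gradient of 0.5 * ||x - target||^2

x = np.zeros(n)
x[0] = 1.0                       # start at a simplex vertex
for k in range(500):
    s = np.zeros(n)
    s[np.argmin(grad(x))] = 1.0  # linear minimization oracle
    step = 2.0 / (k + 2)         # standard FW step size
    x = (1 - step) * x + step * s
```

Each iterate is a convex combination of simplex vertices, so feasibility is automatic, which is precisely why no projection step is needed.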
Dvinskikh D., Tiapkin D., in: Proceedings of Machine Learning Research Volume 130: International Conference on Artificial Intelligence and Statistics. [s.n.], 2021. P. 1738-1746.
In this paper, we focus on computational aspects of the Wasserstein barycenter problem. We propose two algorithms to compute Wasserstein barycenters of discrete measures. The first algorithm, based on mirror prox with a specific norm, matches the complexity of the celebrated accelerated iterative Bregman projections (IBP), however, without the limitations of the (accelerated) IBP, which is ...
Added: November 2, 2022
Dvinskikh D., Gorbunov E., Gasnikov A. et al., in: 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019. P. 7435-7440.
We introduce primal and dual stochastic gradient oracle methods for distributed convex optimization problems over networks. We show that the proposed methods are optimal (in terms of communication steps) for primal and dual oracles. Additionally, for a dual stochastic oracle, we propose a new analysis method for the rate of convergence in terms of duality ...
Added: February 5, 2021
Bogachev T. / Cornell University. Series math "arxiv.org". 2022.
In this research a continuous model for resource allocation in a queuing system is considered, and a local prediction of the system behavior is developed. As a result we obtain a set of possible cases, some of which lead to quite clear optimization problems. Currently, the main result of this research direction is an algorithm ...
Added: October 21, 2022
Andresen A., Spokoiny V., Journal of Machine Learning Research 2016 Vol. 17 No. 63 P. 1-53
We derive two convergence results for a sequential alternating maximization procedure for approximating the maximizer of random functionals, such as the realized log-likelihood in MLE estimation. We show that the sequence attains the same deviation properties as shown for the profile M-estimator by Andresen and Spokoiny (2013), which means a finite sample ...
Added: September 8, 2016
Kovalev D., Shulgin E., Richtarik P. et al., PMLR, 2021
We propose ADOM – an accelerated method for smooth and strongly convex decentralized optimization over time-varying networks. ADOM uses a dual oracle, i.e., we assume access to the gradient of the Fenchel conjugate of the individual loss functions. Up to a constant factor, which depends on the network structure only, its communication complexity is the ...
Added: October 31, 2021
Gasnikov A., Dvurechensky P., Gorbunov E. et al., in: Conference on Learning Theory, 25-28 June 2019, Phoenix, USA. Vol. 99. [s.n.], 2019. P. 1374-1391.
We consider convex optimization problems with the objective function having Lipschitz-continuous p-th order derivative, where p ≥ 1. We propose a new tensor method, which closes the gap between the lower Ω(ε^(−2/(3p+1))) and upper O(ε^(−1/(p+1))) iteration complexity bounds for this class of optimization problems. We also consider uniformly convex functions, and show how the proposed ...
Added: October 31, 2020
Gorbunov E., Kovalev D., Makarenko D. et al., in: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Curran Associates, Inc., 2020. P. 20889-20900.
Added: December 7, 2020
Garces A., Azhmyakov V., IFAC-PapersOnLine 2020 Vol. 53 No. 2 P. 13173-13177
This paper deals with an application of nested convex programming to the optimal power flow (OPF) problem in multi-terminal high-voltage direct-current (MT-HVDC) grids. The real-world optimization problem under consideration is non-convex. This fact implies possible inconsistencies of conventional numerical minimization algorithms (such as the interior point method). Moreover, the constructive numerical treatment of this ...
Added: October 30, 2021
Vorontsova E., Gasnikov A., Dvurechensky P. et al., Automation and Remote Control 2019 Vol. 80 No. 8 P. 1487-1501
We propose an accelerated gradient-free method with a non-Euclidean proximal operator associated with the p-norm (1 ⩽ p ⩽ 2). We obtain estimates for the rate of convergence of the method under low noise arising in the calculation of the function value. We present the results of computational experiments. ...
Added: December 10, 2019
Kamzolov D., Dvurechensky P., Gasnikov A., Optimization Methods and Software 2021 Vol. 36 No. 6 P. 1289-1316
In this paper, we propose new first-order methods for the minimization of a convex function on a simple convex set. We assume that the objective function is a composite function given as a sum of a simple convex function and a convex function with an inexact Hölder-continuous subgradient. We propose the Universal Intermediate Gradient Method. Our method enjoys ...
Added: August 4, 2020
Ingacheva A., Kokhan V., Osipov D., in: Proceedings of the 32nd European Conference on Modelling and Simulation (ECMS 2018), Wilhelmshaven, Germany, 22-25 May 2018. NY: Curran Associates, Inc., 2018. P. 183-189.
In this paper we consider the task of mapping the inner objects of a building using a set of autonomous agents moving around it, which employ a narrow beam of radio waves at the WiFi frequency (2.4 GHz). A linear model of pixel-wise radio wave attenuation is considered. The SIRT algorithm with TV and Tikhonov regularizations is used for the ...
Added: April 6, 2019
Kornilov N., Shamir O., Lobanov A. et al., in: Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates, Inc., 2023.
In this paper, we consider non-smooth stochastic convex optimization with two function evaluations per round under infinite noise variance. In the classical setting, when the noise has finite variance, an optimal algorithm built upon the batched accelerated gradient method was proposed in (Gasnikov et al., 2022). This optimality is defined in terms of iteration and oracle ...
Added: March 26, 2024
Moulines E., Pereyra M., Durmus A., SIAM Journal on Imaging Sciences 2018 Vol. 11 No. 1 P. 473-506
Modern imaging methods rely strongly on Bayesian inference techniques to solve challenging imaging problems. Currently, the predominant Bayesian computation approach is convex optimization, which scales very efficiently to high-dimensional image models and delivers accurate point estimation results. However, in order to perform more complex analyses, for example, image uncertainty quantification or model selection, it is ...
Added: December 11, 2018
Ivanova A., Dvurechensky P., Vorontsova E. et al., Journal of Optimization Theory and Applications 2022 Vol. 193 No. 1-3 P. 462-490
Many convex optimization problems have structured objective functions written as a sum of functions with different oracle types (e.g., full gradient, coordinate derivative, stochastic gradient) and different arithmetic complexity of these oracles. In the strongly convex case, these functions also have different condition numbers that eventually define the iteration complexity of first-order methods and ...
Added: October 28, 2022
Osokin A., Alayrac J., Lukasewitz I. et al., in: Proceedings of Machine Learning Research. Proceedings of the International Conference on Machine Learning (ICML 2016). Vol. 48. NY: [s.n.], 2016. P. 885-925.
In this paper, we propose several improvements to the block-coordinate Frank-Wolfe (BCFW) algorithm of Lacoste-Julien et al. (2013), recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by ...
Added: October 19, 2017
Protasov V., Jungers R., Linear Algebra and its Applications 2013 Vol. 438 No. 11 P. 4448-4468
We introduce a new approach to evaluating the largest Lyapunov exponent of a family of nonnegative matrices. The method is based on using special positive homogeneous functionals, which give iterative lower and upper bounds for the Lyapunov exponent. These bounds improve previously known ones and converge to the true value. The rate of convergence ...
Added: February 23, 2016
Dvurechensky P., Dvinskikh D., Gasnikov A. et al., in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). Neural Information Processing Systems Foundation, 2018. P. 10760-10770.
We study the decentralized distributed computation of discrete approximations for the regularized Wasserstein barycenter of a finite set of continuous probability measures distributedly stored over a network. We assume there is a network of agents/machines/computers, and each agent holds a private continuous probability measure and seeks to compute the barycenter of all the measures in ...
Added: October 31, 2020