### Book

## Proceedings of Machine Learning Research

Volume 80 is assigned to the 2018 International Conference on Machine Learning (ICML 2018)

We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$. For the first algorithm, which is based on the celebrated Sinkhorn’s algorithm, we prove the complexity bound $\widetilde{O}\left(\frac{n^2}{\varepsilon^2}\right)$ arithmetic operations ($\widetilde{O}$ hides polylogarithmic factors $(\ln n)^c$, $c>0$). For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD) algorithm, we prove the complexity bound $\widetilde{O}\left(\min\left\{\frac{n^{9/4}}{\varepsilon}, \frac{n^{2}}{\varepsilon^2} \right\}\right)$ arithmetic operations. Both bounds have better dependence on $\varepsilon$ than the state-of-the-art result given by $\widetilde{O}\left(\frac{n^2}{\varepsilon^3}\right)$. Our second algorithm not only has better dependence on $\varepsilon$ in the complexity bound, but also is not specific to entropic regularization and can solve the OT problem with different regularizers.

We analyze two algorithms for approximating the general optimal transport (OT) distance between two discrete distributions of size $n$, up to accuracy $\varepsilon$. For the first algorithm, which is based on the celebrated Sinkhorn’s algorithm, we prove the complexity bound $\widetilde{O}\left(\frac{n^2}{\varepsilon^2}\right)$ arithmetic operations ($\widetilde{O}$ hides polylogarithmic factors $(\ln n)^c$, $c>0$). For the second one, which is based on our novel Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD) algorithm, we prove the complexity bound $\widetilde{O}\left(\min\left\{\frac{n^{9/4}}{\varepsilon}, \frac{n^{2}}{\varepsilon^2} \right\}\right)$ arithmetic operations. Both bounds have better dependence on $\varepsilon$ than the state-of-the-art result given by $\widetilde{O}\left(\frac{n^2}{\varepsilon^3}\right)$. Our second algorithm not only has better dependence on $\varepsilon$ in the complexity bound, but also is not specific to entropic regularization and can solve the OT problem with different regularizers.

In this paper, we introduce a class of local indicators that enable us to compute efficiently optimal transport plans associated with arbitrary weighted distributions of N demands and M supplies in R in the case where the cost function is concave. Indeed, whereas this problem can be solved linearly when the cost is a convex function of the distance on the line (or more generally when the cost matrix between points is a Monge matrix), to the best of our knowledge no simple solution has been proposed for concave costs, which are more realistic in many applications, especially in economic situations. The problem we consider may be unbalanced, in the sense that the weight of all the supplies might be larger than the weight of all the demands. We show how to use the local indicators hierarchically to solve the transportation problem for concave costs on the line.

We consider the space P(X) of probability measures on arbitrary Radon space X endowed with a transportation cost J(μ, ν) generated by a nonnegative continuous cost function. For a probability distribution on P(X) we formulate a notion of average with respect to this transportation cost, called here the *Fréchet barycenter*, prove a version of the law of large numbers for Fréchet barycenters, and discuss the structure of P(X) related to the transportation cost J.

In this paper, we consider a large class of hierarchical congestion population games. One can show that the equilibrium in a game of such type can be described as a minimum point in a properly constructed multi-level convex optimization problem. We propose a fast primal-dual composite gradient method and apply it to the problem, which is dual to the problem describing the equilibrium in the considered class of games. We prove that this method allows to find an approximate solution of the initial problem without increasing the complexity.