Optimal learning via local entropies and sample compression

Proceedings of Machine Learning Research. 2017. Vol. 65. P. 2023–2065.

Zhivotovskiy N.

Under margin assumptions, we prove several risk bounds, represented via the distribution dependent local entropies of the classes or the sizes of specific sample compression schemes. In some cases, our guarantees are optimal up to constant factors for families of classes. We discuss limitations of our approach and give several applications. In particular, we provide a new tight PAC bound for the hard-margin SVM, an extended analysis of certain empirical risk minimizers under log-concave distributions, a new variant of an online to batch conversion, and distribution dependent localized bounds in the aggregation framework. As a part of our results, we give a new upper bound for the uniform deviations under Bernstein assumptions, which may be of independent interest. The proofs for the sample compression schemes are based on the moment method combined with the analysis of voting algorithms.

Priority areas: IT and mathematics, mathematics

Language: English

Keywords: Statistical learning theory

Bifurcations and Structural Stability of Generic PC-HC Families

Dorovskiy A. / Series arXiv "math". 2026.

In this paper the structural stability of generic families of vector fields of the PC-HC class on the two-dimensional sphere is proved. A classification of these families up to moderate equivalence in neighborhoods of their large bifurcation supports is presented, based on such invariants as the configuration and the characteristic set. The realization lemma is proved. ...

Added: May 14, 2026

On the minimum number of maximal distance-k independent sets in trees

Taletskii D. / Series arXiv "math". 2026.

A vertex subset of a graph is called a distance-$k$ independent set if the distance between any two of its distinct vertices is at least $k + 1$. For all $n,k \geq 1$, we determine the minimum possible number of inclusion-wise maximal distance-$k$ independent sets among all $n$-vertex trees. It equals $n$ if $n \leq k ...

Added: May 1, 2026

On Arithmetic Mirror Symmetry for smooth Fano fourfolds

Ovcharenko M. / Series arXiv "math". 2026.

We introduce an explicit class of tempered Laurent polynomials in the sense of Villegas and Doran-Kerr in n⩽4 variables including all Landau-Ginzburg models for smooth Fano threefolds with very ample anticanonical class. We check that it contains Landau-Ginzburg models for various Fano fourfolds which are complete intersections in smooth toric varieties and Grassmannians of planes, ...

Added: April 30, 2026

Natural hazard database from Internet publications: text mining with a large language model

Derkacheva A., Sakirkina M., Kraev G. et al. 2026.

Comprehensive data on natural hazards and their consequences are crucial for effective risk assessment, adaptation planning, and emergency response. However, many countries face challenges with fragmented, inconsistent, and inaccessible data, particularly regarding local-scale events. To address this data gap in Russia, we developed an end-to-end processing pipeline that scrapes news from various online sources, ...

Added: April 28, 2026

Ising models on the hydrogen peroxide and other lattices

Qin X., Deng Y., Shchur L. et al. / Series arXiv "math". 2026. No. 2603.02962.

We perform a Monte Carlo analysis of the Ising model on many three-dimensional lattices. By means of finite-size scaling we obtain the critical points and determine the scaling dimensions. As expected, the critical exponents agree with the three-dimensional Ising universality class for all models. The irrelevant field, as revealed by the correction-to-scaling amplitudes, appears to ...

Added: April 20, 2026

Algorithmic overlaps as thermodynamic variables: from local to cluster Monte Carlo dynamics in critical phenomena

Pilé I., Deng Y., Shchur L. / Series arXiv "math". 2026. No. 2604.10254.

We investigate the spatial overlap of successive spin configurations in Markov chain Monte Carlo simulations using the local Metropolis algorithm and the Svendsen-Wang and Wolff cluster algorithms. We examine the dynamics of these algorithms for two models in different universality classes: the Ising model and the Potts model with three components. The overlap of two ...

Added: April 20, 2026

On weak solutions to the 1d compressible Navier-Stokes equations: a Lipschitz continuous dependence on data in weaker norms and an error of their homogenization

Zlotnik Alexander / Series arXiv "math". 2026. No. 2602.03481v1.

We deal with the global in time weak solutions to the 1D compressible Navier-Stokes system of equations for large discontinuous initial data and nonhomogeneous boundary conditions of three standard types. We prove the Lipschitz-type continuous dependence of the solution $(\eta,u,\theta)$, in a norm slightly stronger than $L^{2,\infty}(Q)\times L^2(Q)\times L^2(Q)$, on the initial data $(\eta^0,u^0,e^0)$ in a ...

Added: April 18, 2026

On the dimension of the space of static potentials on three-manifolds

Medvedev V. / Series arXiv "math". 2026.

We investigate the interplay between the dimension of the space of static potentials and the geometric and topological structure of the underlying static three-manifold. A partial classification of boundaryless static manifolds is obtained in terms of this dimension. We also treat the case of static manifolds with boundary. In particular, we prove that if a ...

Added: April 3, 2026

Using predefined vector systems to speed up neural network multimillion class classification

Gabdullin N., Androsov I. / Series Computer Science "arxiv.org". 2026.

Label prediction in neural networks (NNs) has O(n) complexity proportional to the number of classes. This holds true for classification using fully connected layers and cosine similarity with some set of class prototypes. In this paper we show that if NN latent space (LS) geometry is known and possesses specific properties, label prediction complexity can ...

Added: April 2, 2026

Homogeneous maximizers of the Blaschke-Santalo-type functionals

Kolesnikov A. / Series arXiv "math". 2025.

We study Blaschke-Santaló-type inequalities for $N \geq 2$ sets (functions) and a special class of cost functions. In particular, we prove new results about reduction of the maximization problem for the Blaschke-Santaló-type functional to the homogeneous case (functional inequalities on the sphere) and extend the symmetrization argument to the case of $N > 2$ sets. We also discuss links to the ...

Added: February 13, 2026

Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection

Sorokin K., Beketov M., Onuchin A. et al. / arxiv.org. Series cs.SI "Social and Information Networks". 2025.

Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...

Added: January 15, 2026

A Contrastive Approach to Online Change Point Detection

Puchkin N., Shcherbakova V., in: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023). Vol. 206. Valencia: PMLR, 2023. P. 5686–5713.

We suggest a novel procedure for online change point detection. Our approach expands an idea of maximizing a discrepancy measure between points from pre-change and post-change distributions. This leads to a flexible procedure suitable for both parametric and nonparametric scenarios. We prove non-asymptotic bounds on the average running length of the procedure and its expected ...

Added: August 2, 2023

Didactics and instructional design: what is common and what is distinctive?

Chernobay E., Koreshnikova Y., Отечественная и зарубежная педагогика 2021 Vol. 1 No. 5 P. 177–190

One of the key characteristics of the modern world is high volatility. Accelerating changes cannot but affect the education system, requiring it to constantly improve itself, including the development of new solutions, proposals and approaches. As a result, foreign educational science has in recent years reassessed the importance of such an approach as ...

Added: October 31, 2021

Conference on Learning Theory, 25-28 June 2019, Phoenix, USA

[s.n.], 2019.

Volume 99: Conference on Learning Theory, 25-28 June 2019, Phoenix, USA ...

Added: October 31, 2020

Localization of VC classes: Beyond local Rademacher complexities

Zhivotovskiy N., Hanneke S., Theoretical Computer Science 2018 Vol. 9 No. 742 P. 27–49

In statistical learning the excess risk of empirical risk minimization (ERM) is controlled by $\left(\frac{\mathrm{COMP}_n(F)}{n}\right)^{\alpha}$, where $n$ is the size of a learning sample, $\mathrm{COMP}_n(F)$ is a complexity term associated with a given class $F$ and $\alpha \in [\frac{1}{2}, 1]$ interpolates between slow and fast learning rates. In this paper we introduce an alternative localization approach for binary classification that leads to a novel complexity measure: fixed points of the local empirical entropy. We show that this ...

Added: December 6, 2018