In many applications, real high-dimensional data occupy only a very small part of the high-dimensional ‘observation space’; that is, the intrinsic dimension of the data is small. The most popular model of such data is the Manifold model, which assumes that the data lie on or near an unknown manifold (Data Manifold, DM) of lower dimensionality embedded in an ambient high-dimensional input space (the Manifold Assumption about high-dimensional data). Manifold Learning is a Dimensionality Reduction problem under the Manifold Assumption about the processed data, and its goal is to construct a low-dimensional parameterization of the DM (global low-dimensional coordinates on the DM) from a finite dataset sampled from the DM.
The Manifold Assumption means that a local neighborhood of each manifold point is equivalent to a region of a low-dimensional Euclidean space. Because of this, most Manifold Learning algorithms consist of two parts: a ‘local part’, in which certain characteristics reflecting the low-dimensional local structure of the neighborhoods of all sample points are constructed via nonparametric estimation, and a ‘global part’, in which global low-dimensional coordinates on the DM are constructed by solving a convex optimization problem for a specific cost function that depends on the local characteristics. This paper considers the statistical properties of both the ‘local part’ and its average over the manifold. The article extends the paper (Yanovich, 2016) to the case of nonparametric estimation.
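To make the ‘local part’ concrete, the following minimal sketch estimates a tangent plane at every sample point by PCA over its nearest neighbors and then averages the estimation error over the sample. It is only an illustration under stated assumptions: local PCA is one common choice of local characteristic and is not necessarily the estimator analyzed in this paper, the unit sphere in three dimensions stands in for the DM, and the names `local_pca_tangents`, `sample_sphere`, `tangent_errors` and the values of `k` and `q` are hypothetical.

```python
import numpy as np


def local_pca_tangents(X, k, q):
    """Estimate a q-dimensional tangent basis at each point via PCA on its k nearest neighbors.

    Assumption: local PCA is used here purely as an example of a 'local part'.
    Returns an array of shape (n, p, q) whose i-th slice has orthonormal columns.
    """
    n, p = X.shape
    # Pairwise squared Euclidean distances between all sample points.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Indices of the k + 1 closest points (the point itself is normally among them).
    idx = np.argsort(d2, axis=1)[:, : k + 1]
    bases = np.empty((n, p, q))
    for i in range(n):
        nbrs = X[idx[i]]
        centered = nbrs - nbrs.mean(axis=0)
        # Rows of vt are principal directions, ordered by decreasing singular value.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        bases[i] = vt[:q].T
    return bases


def sample_sphere(n, rng):
    """Draw n points uniformly from the unit sphere in R^3 (a 2-dimensional manifold)."""
    Z = rng.standard_normal((n, 3))
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)


def tangent_errors(X, bases):
    """Per-point error of the estimated tangent plane on the unit sphere.

    On the unit sphere the true normal at x is x itself, so the length of the
    projection of x onto the estimated plane is the sine of the angle between
    the estimated and the true tangent planes.
    """
    return np.linalg.norm(np.einsum("ipq,ip->iq", bases, X), axis=1)


rng = np.random.default_rng(0)
X = sample_sphere(1000, rng)
bases = local_pca_tangents(X, k=15, q=2)
errors = tangent_errors(X, bases)
mean_error = errors.mean()  # empirical average of the local error over the sample
print(f"mean tangent-plane error: {mean_error:.4f}")
```

The per-point values in `errors` correspond to a statistic of the ‘local part’ at a single sample point, while `mean_error` is its empirical average over the sample, a discrete stand-in for an average over the manifold. Varying the sample size and `k` in this sketch gives a rough feel for how such quantities depend on the neighborhood size.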