### Article

## Algorithmic Statistics: Forty Years Later.

Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how the statistics works and why some statistical models are better than others. After this notion of a "good model" is introduced, a natural question arises: it is possible that for some piece of data there is no good model? If yes, how often these bad ("non-stochastic") data appear "in real life"? Another, more technical motivation comes from algorithmic information theory. In this theory a notion of complexity of a finite object (=amount of information in this object) is introduced; it assigns to every object some number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistic provides a more fine-grained classification: for each finite object some curve is defined that characterizes its behavior. It turns out that several different definitions give (approximately) the same curve. In this survey we try to provide an exposition of the main results in the field (including full proofs for the most important ones), as well as some historical comments. We assume that the reader is familiar with the main notions of algorithmic information (Kolmogorov complexity) theory.

A survey of main results in algorithmic statistics

We present a new structural lemma for deterministic con- text free languages. From the first sight, it looks like a pumping lemma, because it is also based on iteration properties, but it has significant distinctions that makes it much easier to apply. The structural lemma is a combinatorial analogue of KC-DCF-Lemma (based on Kolmogorov complexity), presented by Li and Vit ́anyi in 1995 and corrected by Glier in 2003. The structural lemma allows not only to prove that a language is not a DCFL, but discloses the structure of DCFLs Myhill-Nerode classes.

Joseph Miller [16] and independently Andre Nies, Frank Stephan and Sebastiaan Terwijn [18] gave a complexity characterization of 2-random sequences in terms of plain Kolmogorov complexity C: they are sequences that have infinitely many initial segments with O(1)-maximal plain complexity (among the strings of the same length). Later Miller [17] showed that prefix complexity K can also be used in a similar way: a sequence is 2-random if and only if it has infinitely many initial segments with O(1)-maximal prefix complexity (which is n + K (n) for strings of length n). The known proofs of these results are quite involved; in this paper we provide simple direct proofs for both of them. In [16] Miller also gave a quantitative version of the first result: the 0'-randomness deficiency of a sequence {\omega} equals lim inf [n - C ({\omega}1 . . . {\omega}n)] + O(1). (Our simplified proof can also be used to prove this.) We show (and this seems to be a new result) that a similar quantitative result is also true for prefix complexity: 0'-randomness deficiency equals lim inf [n + K (n) -- K ({\omega}1 . . . {\omega}n)] + O(1). Prefix and plain Kolmogorov complexity characterizations of 2-randomness: simple proofs (PDF Download Available). Available from: http://www.researchgate.net/publication/258082399_Prefix_and_plain_Kolmogorov_complexity_characterizations_of_2-randomness_simple_proofs [accessed Oct 20, 2015].

The paper [Harry Buhrman, Michal Kouck ́, Nikolay Vereshcha- y gin. Randomized Individual Communication Complexity. IEEE Con- ference on Computational Complexity 2008: 321-331] considered com- munication complexity of the following problem. Alice has a bi- nary string x and Bob a binary string y, both of length n, and they want to compute or approximate Kolmogorov complexity C(x|y) of x conditional to y. It is easy to show that deterministic communica- tion complexity of approximating C(x|y) with precision α is at least n − 2α − O(1). The above referenced paper asks what is random- ized communication complexity of this problem and shows that for r- round randomized protocols its communication complexity is at least Ω((n/α)1/r ). In this paper, for some positive ε, we show the lower bound 0.99n for (worst case) communication length of any random- ized protocol that with probability at least 0.01 approximates C(x|y) with precision εn for all input pairs.