Book chapter
The normalized algorithmic information distance cannot be approximated
It is known that the normalized algorithmic information distance N is neither computable nor semicomputable. We show that for all ε < 1/2, there exist no semicomputable functions that differ from N by at most ε. Moreover, for every computable function f such that |lim_t f(x,y,t) − N(x,y)| ≤ ε and for all n, there exist strings x, y of length n such that
∑_t |f(x,y,t+1) − f(x,y,t)| ≥ Ω(log n)
This is optimal up to constant factors.
We also show that the maximal number of oscillations of a limit approximation of N is Ω(n / log n). This strengthens the ω(1) lower bound from [K. Ambos-Spies, W. Merkle, and S.A. Terwijn, 2019, Normalized information distance and the oscillation hierarchy].
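For reference, N here denotes the normalized algorithmic information distance. In terms of the conditional Kolmogorov complexity K, it is standardly defined (following Li and Vitányi) as
N(x, y) = max{K(x|y), K(y|x)} / max{K(x), K(y)}.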
Antistochastic strings are those strings that do not have any reasonable statistical explanation. We establish the following property of such strings: every antistochastic string x is “holographic” in the sense that it can be restored by a short program from any part of it whose length equals the Kolmogorov complexity of x. Further, we show how this property can be used for list decoding from erasures, and we prove that Symmetry of Information fails for total conditional complexity.
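In symbols (our paraphrase of the holographic property, not a verbatim statement from the chapter): if x is an antistochastic string of length n and x′ is obtained from x by erasing the symbols at any n − C(x) positions (the erased positions being marked), then
C(x | x′) = O(log n),
i.e., a logarithmic-length program restores x from any of its parts of length C(x).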
We present a new structural lemma for deterministic context-free languages. At first sight it looks like a pumping lemma, because it is also based on iteration properties, but it has significant distinctions that make it much easier to apply. The structural lemma is a combinatorial analogue of the KC-DCF Lemma (based on Kolmogorov complexity) presented by Li and Vitányi in 1995 and corrected by Glier in 2003. The structural lemma allows one not only to prove that a language is not a DCFL, but also discloses the structure of the Myhill–Nerode classes of a DCFL.
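For context, one canonical target for such lemmas (our illustration, not necessarily the example treated in the chapter) is the palindrome language
L = { w w^R : w ∈ {a, b}* },
which is context-free but not deterministic context-free, so any sufficiently strong structural or pumping lemma for DCFLs must exclude it.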
Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. Once this notion of a "good model" is introduced, a natural question arises: is it possible that for some piece of data there is no good model? If so, how often do such bad ("non-stochastic") data appear "in real life"? Another, more technical motivation comes from algorithmic information theory. In this theory a notion of complexity of a finite object (the amount of information in this object) is introduced; it assigns to every object a number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistics provides a more fine-grained classification: for each finite object a curve is defined that characterizes its behavior. It turns out that several different definitions give (approximately) the same curve. In this survey we try to provide an exposition of the main results in the field (including full proofs for the most important ones), as well as some historical comments. We assume that the reader is familiar with the main notions of algorithmic information theory (Kolmogorov complexity).
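Kolmogorov complexity itself is uncomputable (it is only upper-semicomputable), so in practice one often substitutes the length of a compressed encoding as a crude upper bound. A minimal sketch in Python (our illustration, not part of the survey; zlib is just one convenient compressor):

import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    # Length in bits of a zlib-compressed encoding of `data`.
    # This only upper-bounds the Kolmogorov complexity, up to the
    # additive constant accounting for a fixed decompressor.
    return 8 * len(zlib.compress(data, 9))

print(complexity_upper_bound(b"a" * 1000))       # regular data: small
print(complexity_upper_bound(os.urandom(1000)))  # random data: close to 8000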
The paper [Harry Buhrman, Michal Koucký, Nikolay Vereshchagin. Randomized Individual Communication Complexity. IEEE Conference on Computational Complexity 2008: 321-331] considered the communication complexity of the following problem. Alice has a binary string x and Bob a binary string y, both of length n, and they want to compute or approximate the Kolmogorov complexity C(x|y) of x conditional on y. It is easy to show that the deterministic communication complexity of approximating C(x|y) with precision α is at least n − 2α − O(1). The above-referenced paper asks what the randomized communication complexity of this problem is and shows that for r-round randomized protocols it is at least Ω((n/α)^{1/r}). In this paper we show that, for some positive ε, the (worst-case) communication length of any randomized protocol that with probability at least 0.01 approximates C(x|y) with precision εn for all input pairs is at least 0.99n.