Algorithmic Statistics, Prediction and Machine Learning

A. Milovanov

doi:10.4230/LIPIcs.STACS.2016.54


Ch. 54. P. 1–13.

Milovanov A.

Algorithmic statistics considers the following problem: given a binary string x (e.g., some experimental data), find a “good” explanation of this data. It uses algorithmic information theory to define formally what a good explanation is. In this paper we extend this framework in two directions.

First, the explanations are not only interesting in themselves but are also used for prediction: we want to know what kind of data we may reasonably expect in similar situations (repeating the same experiment). We show that some kind of hierarchy can be constructed both in terms of algorithmic statistics and using the notion of a priori probability, and these two approaches turn out to be equivalent (Theorem 5).

Second, a more realistic approach, which goes back to machine learning theory, assumes that we have not a single data string x but some set of “positive examples” x_1, ..., x_l that all belong to some unknown set A, a property that we want to learn. We want this set A to contain all positive examples and to be as small and simple as possible. We show how algorithmic statistics can be extended to cover this situation (Theorem 11).

Language: English


Keywords: prediction, Kolmogorov complexity, minimum description length

Publication based on the results of:

Theoretical Computer Science (2016)

In book

33rd Symposium on Theoretical Aspects of Computer Science (STACS 2016) Leibniz International Proceedings in Informatics (LIPIcs)

Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, 2016.

Predictions and Algorithmic Statistics for Infinite Sequences

Milovanov A., in: Computer Science – Theory and Applications: 16th International Computer Science Symposium in Russia, CSR 2021, Sochi, Russia, June 28–July 2, 2021, Proceedings. Springer, 2021. Ch. 17. P. 283–295.

We combine Solomonoff’s approach to universal prediction with algorithmic statistics and suggest using, for prediction, the computable measure that provides the best “explanation” for the observed data (in the sense of algorithmic statistics). In this way we keep the expected sum of squares of prediction errors bounded (as it was for Solomonoff’s predictor) ...

Added: August 11, 2021

New possibilities for applying artificial intelligence methods to model the onset and development of diseases and to optimize their prevention and treatment

Dumler A. A., Cherepanov F. M., Terapiya 2018 No. 1(19) P. 109–118

This article is devoted to methodological issues in applying artificial intelligence techniques to preventive medicine. Using a specific example, we show that a neural network makes it possible not only to diagnose cardiovascular diseases but also to predict, on a quantitative basis, their emergence and development in future periods of life. This allows you ...

Added: January 9, 2019

Complexity of complexity and maximal plain versus prefix-free Kolmogorov complexity

Bauwens B. F., Shen A., Journal of Symbolic Logic 2013 Vol. 79 No. 2 P. 620–632

Added: October 2, 2015

TECHNIQUES IN DEVELOPING LISTENING SKILLS WHEN TEACHING INTERPRETERS

Krivoshlykova L., Pushkina A., Ryabkova V., in: EDULEARN21 Proceedings: 13th International Conference on Education and New Learning Technologies, July 5th-6th, 2021. IATED, 2021. P. 1076–1081.

Listening training plays an important role in teaching interpreters. In this regard, educators and psychologists consider the listening stage in the process of translation to be a complex receptive mental and mnemonic activity, which is associated not only with the perception of speech messages, but also with their comprehension and analysis. The aim of the ...

Added: November 14, 2022

Algorithmic Statistics: Normal Objects and Universal Models

Milovanov A., in: Computer Science – Theory and Applications. 11th International Computer Science Symposium in Russia, CSR 2016, St. Petersburg, Russia, June 9-13, 2016, Proceedings. Vol. 9691: Lecture Notes in Computer Science. Switzerland: Springer, 2016. P. 280–293.

In algorithmic statistics, the quality of a statistical hypothesis (a model) P for data x is measured by two parameters: the Kolmogorov complexity of the hypothesis and the probability P(x). A class of models S_ij that are the best from this point of view was discovered. However, these models are too abstract. To restrict the class ...

Added: June 27, 2016

The Role of Arousal and Valence in Predictions of Short Video Virality: A Psychophysiological Perspective

Shelepenkov D., Kosonogov V., Computers in Human Behavior 2025

Predicting the Virality of TikTok Videos: A Psychophysiological Study ...

Added: June 20, 2025

Sophistication vs Logical Depth

Antunes L., Bauwens B. F., Souto A. et al., Theory of Computing Systems 2017 Vol. 60 No. 2 P. 280–298

Sophistication and logical depth are two measures that express how complicated the structure in a string is. Sophistication is defined as the minimal complexity of a computable function that defines a two-part description for the string that is shortest within some precision; the second can be defined as the minimal computation time of a program ...

Added: May 4, 2016

Algorithmic Statistics: Forty Years Later.

Shen A., Vereshchagin N., in: Computability and Complexity. Berlin: Springer, 2017. P. 669–737.

Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. After this notion of a "good model" is introduced, a natural question arises: is it possible that for some piece of data there ...

Added: October 26, 2018

Inequalities for space-bounded Kolmogorov complexity

Bauwens B. F., Gács P., Romashchenko A. et al., Computability 2022 Vol. 11 No. 3-4 P. 165–185

Finding all linear inequalities for entropies remains an important open question in information theory. For a long time the only known inequalities for entropies of tuples of random variables were Shannon (submodularity) inequalities. Only in 1998 did Zhang and Yeung find the first inequality that cannot be represented as a convex combination of Shannon inequalities, and ...

Added: December 23, 2022

On information content in certain objects

Vereshchagin N. / Series arXiv "math". 2024.

A fine approach to measuring information dependence is based on the total conditional complexity CT(y|x), which is defined as the minimal length of a total program that outputs y on the input x. It is known that the total conditional complexity can be much larger than the plain conditional complexity. Such strings x, y ...

Added: August 19, 2024

On Algorithmic Statistics for Space-bounded Algorithms

Milovanov A., Theory of Computing Systems 2019 Vol. 63 No. 4 P. 833–848

Algorithmic statistics looks for models of observed data that are good in the following sense: a model is simple (i.e., has small Kolmogorov complexity) and captures all the algorithmically discoverable regularities in the data. However, this idea cannot be used in practice as is because Kolmogorov complexity is not computable. In this paper we ...

Added: October 17, 2018

Possibilities of modeling predisposition to drug addiction using artificial intelligence methods

Yasnitsky L., Gratsilev V. I., Kulyashova Yu. S. et al., Vestnik Permskogo universiteta. Filosofiya. Psikhologiya. Sotsiologiya 2015 No. 1 P. 61–71

A computer program has been designed to determine the degree of a person's predisposition to drug addiction. The program is based on a neural network trained on the results of sociological surveys. The error of the neural network model was less than 1%. The neural network model was used to evaluate the importance of factors that can influence predisposition to drug addiction. ...

Added: February 20, 2016

Predictions, fast and slow

Sekerina I. A., Linguistic Approaches to Bilingualism 2015 No. 5 P. 532–536

Added: October 21, 2015

On Algorithmic Statistics for Space-Bounded Algorithms

Milovanov A., in: Computer Science – Theory and Applications: 12th International Computer Science Symposium in Russia (CSR 2017). Vol. 10304. Luxemburg: Springer, 2017. P. 232–244.

Algorithmic statistics studies explanations of observed data that are good in the algorithmic sense: an explanation should be simple (i.e., should have small Kolmogorov complexity) and capture all the algorithmically discoverable regularities in the data. However, this idea cannot be used in practice because Kolmogorov complexity is not computable. In this paper we develop algorithmic ...

Added: October 15, 2017

Stochasticity in Algorithmic Statistics for Polynomial Time

Vereshchagin N., Milovanov A., in: 32nd Computational Complexity Conference. Wadern: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, 2017. P. 1–18.

A fundamental notion in Algorithmic Statistics is that of a stochastic object, i.e., an object having a simple plausible explanation. Informally, a probability distribution is a plausible explanation for x if it looks likely that x was drawn at random with respect to that distribution. In this paper, we suggest three definitions of a plausible statistical ...

Added: October 12, 2017

Algorithmic Statistics: Forty Years Later.

Vereshchagin N., Shen A., Lecture Notes in Computer Science 2017 Vol. 10010 P. 669–737

Added: February 13, 2017

Predictions of Chalcospinels with Composition ABCX4 (X = S or Se)

Kiselyova N. N., Dudarev V., Ryazanov V. V. et al., Inorganic Materials: Applied Research 2021 Vol. 12 No. 2 P. 328–336

New chalcospinels of the most common compositions were predicted: A^I B^III C^IV X_4 (X = S or Se) and A^II B^III C^III S_4 (A, B, and C are various chemical elements). They are promising for the search for new materials for magneto-optical memory elements, sensors, and anodes in sodium-ion batteries. The values of the crystal lattice parameter “a” are estimated. When predicting, only the ...

Added: April 2, 2021

Kolmogorov complexity and algorithmic randomness

Shen A., Uspensky V. A., Vereshchagin N., American Mathematical Society, 2017.

Added: October 12, 2017

Fast Approximate Energy Minimization with Label Costs

Delong A., Osokin A., Isack H. et al., International Journal of Computer Vision 2012 Vol. 96 No. 1 P. 1–27

The α-expansion algorithm has had a significant impact in computer vision due to its generality, effectiveness, and speed. It is commonly used to minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main algorithmic contribution is an extension of α-expansion that also optimizes “label costs” with well-characterized optimality bounds. Label costs penalize a solution based ...

Added: October 18, 2017

Analysis of Twitter Users’ Mood for Prediction of Gold and Silver Prices in the Stock Market

Porshnev Alexander, Redkin Ilya, in: Analysis of Images, Social Networks and Texts. Vol. 436: 3rd International Conference on Analysis of Images, Social Networks, and Texts. NY: Springer, 2014. Ch. 19. P. 190–197.

The question of whether Twitter users’ moods can be used to increase the accuracy of stock price movement prediction draws the attention of many researchers. In this paper we examine the possibility of analyzing Twitter users’ mood to improve the accuracy of predictions for Gold and Silver stock market prices. We used a lexicon-based approach to categorize the mood ...

Added: November 21, 2014

Plain stopping time and conditional complexities revisited

Posobin G. I., Shen A., Andreev M., in: 43rd International Symposium on Mathematical Foundations of Computer Science (MFCS 2018). Vol. 117. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2018. P. 1–24.

In this paper we analyze the notion of "stopping time complexity", informally defined as the amount of information needed to specify when to stop while reading an infinite sequence. This notion was introduced by Vovk and Pavlovic (2016). It turns out that plain stopping time complexity of a binary string x could be equivalently defined as (a) ...

Added: October 11, 2018

Prediction in Regulating the Heat Treatment of Ferroconcrete

Chumachenko E., Zak A. M., Russian Engineering Research 2012 Vol. 32 No. 9-10 P. 651–654

Neural networks are applied to the long-term prediction of parameter variation. Regulation of the heat treatment of ferroconcrete is considered as an example. Experimental results are presented. An algorithm is proposed for planning the operation of the executive mechanism. ...

Added: February 11, 2014

Some Properties of Antistochastic Strings

Milovanov A., Theory of Computing Systems 2017 Vol. 61 No. 2 P. 521–535

Algorithmic statistics is a part of algorithmic information theory (Kolmogorov complexity theory) that studies the following task: given a finite object x (say, a binary string), find an “explanation” for it, i.e., a simple finite set that contains x and where x is a “typical element”. Both notions (“simple” and “typical”) are defined in terms ...

Added: June 27, 2016