Analytical Distribution Model for Average Values of Syntactic Variables in Russian Literary Texts
Digital technologies open new possibilities for studying cultural heritage. Literary research based on large text corpora makes it possible to pose and solve theoretical problems that previously seemed intractable. For example, it has become possible to model the literary system of a definite literary period (e.g., the Silver Age of Russian literature) and to classify prose writers according to their stylistic features. Beyond that, it allows more general theoretical problems to be addressed. The present study was conducted on Russian literary texts of the early 20th century. The sample included 100 short stories by 100 different writers. Measurements were carried out for 5 syntactic variables. For each of the resulting distributions, the most commonly used statistics were calculated. Based on these data, we consider an empirical verification of Lyapunov's central limit theorem (CLT). The article validates the effectiveness of the CLT and the conditions under which it applies. Besides the normal (Gaussian) function, we used another analytical model, the Hausstein function. It turned out that, for each of the five variables, neither theoretical distribution contradicts the experimental data. However, the alternative analytical model (the Hausstein function) showed even better agreement with the experimental data. The results may be used in computational linguistic studies and in research on Russian literary heritage.
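For reference, Lyapunov's central limit theorem mentioned above can be stated as follows. This is the standard textbook formulation; the notation ($X_i$, $\mu_i$, $\sigma_i^2$, $s_n$, $\delta$) is assumed here for illustration and is not taken from the article.

```latex
% Lyapunov's CLT (textbook form; symbols are illustrative, not the article's own).
% Let X_1, ..., X_n be independent (not necessarily identically distributed)
% random variables with finite means \mu_i and finite variances \sigma_i^2.
\[
  s_n^2 = \sum_{i=1}^{n} \sigma_i^2 .
\]
% Lyapunov's condition: for some \delta > 0,
\[
  \lim_{n \to \infty} \frac{1}{s_n^{2+\delta}}
  \sum_{i=1}^{n} \mathbb{E}\!\left[ \lvert X_i - \mu_i \rvert^{2+\delta} \right] = 0 .
\]
% Under this condition the normalized sum converges in distribution
% to the standard normal law:
\[
  \frac{1}{s_n} \sum_{i=1}^{n} \left( X_i - \mu_i \right)
  \xrightarrow{\;d\;} \mathcal{N}(0, 1) .
\]
```

Unlike the classical CLT, this form does not require the summands to be identically distributed, which is why it is a natural candidate when the averaged quantities come from heterogeneous sources such as texts by different authors.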