Statistical analysis of communication services market in Russia
In this article we examine the factors affecting fortified wine consumption in Russia using micro-level data from the Russian Longitudinal Monitoring Survey (RLMS). A limited dependent variable model is applied. Our analysis shows that Russian males demonstrate a persistent propensity for fortified wine consumption, owing to its higher alcohol content. Our findings reflect a diminishing marginal effect of age, while the estimated coefficient for marital status is negative and statistically significant. Respondents from southern regions do not opt for fortified wine. One possible explanation is that Krasnodar Province, located in the Southern Federal District, is known as one of Russia's major wine producers.
A procedure for modeling performance on the EGE (Unified State Exam) in mathematics is considered. Important predictors are identified for some of the tasks. Binary logistic regression and ordinal regression models are built to predict the probability of solving a task.
The article studies the educational trajectories of schoolchildren in Yaroslavl Oblast. The findings indicate that a schoolchild’s educational achievements, the educational plans of his/her friends, and the level of education of his/her father are key predictors of the decision to continue education. This information makes it possible to identify which schoolchildren are at risk of not continuing their studies. The research also examined the comparative advantages of logistic regression and discriminant analysis for binary dependent variables. When the assumptions of each method are satisfied, both approaches classify schoolchildren well.
Classical change-point detection procedures assume that the change-point model is known and that a change establishes a new observation regime, i.e. the change lasts infinitely long. These modeling assumptions contradict the way applied problems are stated. Therefore, even theoretically optimal statistics very often fail in practice when detecting transient changes online. In this work, to overcome the limitations of classical change-point detection procedures, we consider approaches to constructing ensembles of change-point detectors, i.e. algorithms that use many detectors to reliably identify a change-point. We propose a learning paradigm and specific implementations of ensembles for detecting short-term (transient) changes in observed time series. We demonstrate by means of numerical experiments that the performance of an ensemble is superior to that of conventional change-point detection procedures.
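The idea of combining several detectors can be illustrated with a minimal Python sketch. It is not the learning paradigm proposed in the abstract: it assumes a fixed voting rule over hand-picked detectors, and the function names, drifts, window width and thresholds are all hypothetical values chosen only for illustration.

```python
from functools import partial


def cusum_path(xs, drift):
    """One-sided CUSUM statistic for an upward mean shift (drift is an assumed allowance)."""
    s, path = 0.0, []
    for x in xs:
        s = max(0.0, s + x - drift)
        path.append(s)
    return path


def window_path(xs, width):
    """Sum over a trailing window divided by sqrt(window length)."""
    path = []
    for t in range(len(xs)):
        window = xs[max(0, t - width + 1): t + 1]
        path.append(sum(window) / len(window) ** 0.5)
    return path


def ensemble_alarm(xs, detectors, min_votes):
    """Return the first index at which at least min_votes detectors exceed their thresholds."""
    paths = [(stat(xs), threshold) for stat, threshold in detectors]
    for t in range(len(xs)):
        votes = sum(path[t] > threshold for path, threshold in paths)
        if votes >= min_votes:
            return t
    return None


# Hypothetical ensemble: (statistic, threshold) pairs, not taken from the article.
DETECTORS = [
    (partial(cusum_path, drift=0.5), 4.0),
    (partial(cusum_path, drift=1.0), 4.0),
    (partial(window_path, width=5), 3.0),
]

# Noise-free toy series with a transient shift on indices 50..59.
series = [0.0] * 50 + [2.0] * 10 + [0.0] * 50
alarm = ensemble_alarm(series, DETECTORS, min_votes=2)
print(alarm)
```

On this toy series the majority vote fires at index 53, three steps after the onset of the transient change, and no alarm is raised on a flat series. A learned ensemble would replace the fixed vote with a rule fitted to labeled examples.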
In this study a CHAID-based approach to detecting classification accuracy heterogeneity across segments of observations is proposed. It helps to solve two important problems facing a model builder: (1) how to automatically detect segments in which the model significantly underperforms, and (2) how to use knowledge about classification accuracy heterogeneity across segments to partition observations in order to achieve better predictive accuracy. The approach was applied to churn data from the UCI Repository of Machine Learning Databases. By splitting the data set into four parts based on the decision tree and building a separate logistic regression scoring model for each segment, we increased the accuracy by more than 7 percentage points on the test sample. A significant increase in recall and precision was also observed. It was shown that different segments may have entirely different churn predictors. Such a partitioning therefore gives better insight into the factors influencing customer behavior.
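The first question, detecting segments where a model underperforms, can be sketched in Python as a chi-square test of "prediction correct" against "segment membership". This is only the single test step, under the assumption that segments are already given; CHAID itself searches over candidate predictors to grow a tree. The function names and the 5% significance level are illustrative assumptions, not details from the study.

```python
from collections import defaultdict

# Critical value of the chi-square distribution with 1 degree of freedom at the 5% level.
CHI2_CRIT_DF1_5PCT = 3.841


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def underperforming_segments(segments, y_true, y_pred):
    """Map each segment whose accuracy is significantly below overall accuracy to that accuracy."""
    hits, totals = defaultdict(int), defaultdict(int)
    for seg, yt, yp in zip(segments, y_true, y_pred):
        totals[seg] += 1
        hits[seg] += int(yt == yp)

    n = len(y_true)
    all_hits = sum(hits.values())
    overall = all_hits / n

    flagged = {}
    for seg, total in totals.items():
        a = hits[seg]                # correct, inside the segment
        b = total - a                # wrong, inside the segment
        c = all_hits - a             # correct, outside the segment
        d = (n - total) - c          # wrong, outside the segment
        accuracy = a / total
        if chi2_2x2(a, b, c, d) > CHI2_CRIT_DF1_5PCT and accuracy < overall:
            flagged[seg] = accuracy
    return flagged


# Toy data: segment "A" is 90% correct, segment "B" only 60% correct.
segments = ["A"] * 100 + ["B"] * 100
y_true = [1] * 200
y_pred = [1] * 90 + [0] * 10 + [1] * 60 + [0] * 40
flagged = underperforming_segments(segments, y_true, y_pred)
print(flagged)
```

Segments flagged this way are natural candidates for a separate scoring model, which corresponds to the second question addressed in the abstract.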