Bankruptcy Prediction for Russian Companies: A Cross-Industry Comparison
The primary aim of this research is to compare statistical models for predicting a critical financial state in Russian private small and medium-sized companies across different sectors of the economy.
We use the following methods: Linear Discriminant Analysis, Quadratic Discriminant Analysis, Mixture Discriminant Analysis, Logistic Regression, Probit Regression, Classification Tree and Random Forest. Our dataset consists of approximately 1,000,000 observations from the Ruslana database and covers the period from 2011 to 2012.
Instead of the standard definition of default, we use the notion of a critical financial state, which covers both companies liquidated as a result of legal bankruptcy and companies liquidated voluntarily.
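The target definition above can be illustrated with a minimal sketch. The status strings and the function name are hypothetical, chosen only for illustration; the text does not give the actual field values used in the Ruslana database.

```python
# Hypothetical status codes (assumption): the source does not name the
# actual values recorded in the database.
CRITICAL_STATUSES = {"bankruptcy_liquidation", "voluntary_liquidation"}


def is_critical(status: str) -> int:
    """Return 1 if the company is in a critical financial state, else 0.

    Both liquidation through legal bankruptcy and voluntary liquidation
    count as critical, following the broadened definition in the text.
    """
    return int(status in CRITICAL_STATUSES)


# Usage: build a binary target from a list of company statuses.
statuses = ["active", "voluntary_liquidation", "bankruptcy_liquidation"]
targets = [is_critical(s) for s in statuses]
```

Under the standard definition of default, only the bankruptcy status would map to 1; the broader label also marks voluntary liquidations as positive cases.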
We study four industries in detail: construction, manufacturing, real estate activities, and retail and wholesale trade. Comparing the industries, we arrive at several conclusions. On the one hand, the differences between sectors are so significant that they cannot be captured by adding industry dummy variables to a single model; separate models must be estimated for each industry.
On the other hand, the sectors are similar in several respects. First, the importance ranking of regressors is stable across the analysed sectors, so a single optimal set of variables is selected out of six possible alternatives. In addition, including non-financial characteristics greatly improves predictive power: company age and federal region are the key non-financial variables, company size is less important, and legal form is the weakest predictor. Second, Random Forest outperforms the other statistical approaches on all data sets. For this method, the area under the ROC curve (the comparison criterion applied) reaches up to 0.75 in every industry.
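For readers unfamiliar with the comparison criterion, the following is a minimal sketch of how the area under the ROC curve can be computed from true labels and model scores, using the rank-sum (Mann-Whitney) identity. The function name and the toy numbers are assumptions made for illustration; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels: iterable of 0/1 (1 = critical financial state)
    scores: iterable of model scores, higher = more likely critical
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores and give each the average 1-based rank.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy comparison of two hypothetical models on the same four companies.
y_true = [0, 0, 1, 1]
auc_a = roc_auc(y_true, [0.1, 0.4, 0.35, 0.8])  # one misordered pair
auc_b = roc_auc(y_true, [0.1, 0.2, 0.6, 0.9])   # perfect ranking
```

The value equals the probability that a randomly chosen critical company receives a higher score than a randomly chosen non-critical one, so 0.5 corresponds to random ranking and 1.0 to perfect separation.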
This research should be of particular interest to banks and other credit organisations that lend to small and medium-sized businesses, as well as to state regulators.