Two-step classification method based on genetic algorithm for bankruptcy forecasting
To date, many bankruptcy forecasting models have been developed, yet the area remains an active field of research, and little is known about the practical application of existing models. In our opinion, this is because the use of existing models is limited by the conditions under which they were developed. Another open question concerns the factors that are significant for forecasting. Many authors suggest that indicators of the external environment, corporate governance, and firm size contain important information; on the other hand, a large number of factors does not necessarily increase the predictive ability of a model. In this paper, we propose a genetic-algorithm-based two-step classification method (TSCM) that allows both selecting the relevant factors and adapting the model itself to the application. Classifiers of various models are trained at the first step and combined into a voting ensemble at the second step. A combination of random sampling and feature selection techniques was used to ensure the necessary level of diversity among the classifiers at the first step. Genetic algorithms are applied at the feature selection step and again when determining the weights of the classifiers in the ensemble. The characteristics of the proposed method were tested on a balanced data set that included 912 observations of Russian companies (456 bankrupt and 456 successful) and 55 features (financial ratios and macro/micro business environment factors). The proposed method showed the best accuracy (0.934) among the tested models. It also showed the most balanced precision-recall ratio: it identified bankrupt (recall = 0.953) and non-bankrupt (precision = 0.910) companies more accurately than the other tested models. The ability of the method to select task-relevant features was also tested.
Excluding the features that are significant for less than 50% of the classifiers in the ensemble improved all performance metrics (accuracy = 0.951, precision = 0.932, recall = 0.965). Thus, the proposed method strengthens the advantages and alleviates the weaknesses inherent in ordinary classifiers, supporting business decisions with higher reliability.
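
The second step and the feature-exclusion rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`weighted_vote`, `evolve_weights`, `frequent_features`), the genetic operators (truncation selection, uniform crossover, Gaussian mutation), and all parameter values are assumptions chosen for illustration, and the first-step classifiers are represented only by their 0/1 predictions and their selected feature sets.

```python
import random


def weighted_vote(predictions, weights):
    """Combine per-classifier 0/1 predictions by weighted majority vote."""
    total = sum(weights)
    labels = []
    for i in range(len(predictions[0])):
        score = sum(w * p[i] for w, p in zip(weights, predictions))
        # Predict class 1 (bankrupt) when it holds more than half the weight.
        labels.append(1 if 2 * score > total else 0)
    return labels


def accuracy(y_true, y_pred):
    """Share of observations whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evolve_weights(predictions, y_true, pop_size=30, generations=40,
                   mutation_rate=0.2, seed=0):
    """Search for ensemble weights with a simple elitist genetic algorithm.

    The operators below are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    k = len(predictions)

    def fitness(weights):
        return accuracy(y_true, weighted_vote(predictions, weights))

    population = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clamped to the interval [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def frequent_features(selected_sets, min_share=0.5):
    """Keep features selected by at least `min_share` of the classifiers."""
    counts = {}
    for features in selected_sets:
        for name in features:
            counts[name] = counts.get(name, 0) + 1
    n = len(selected_sets)
    return sorted(name for name, c in counts.items() if c / n >= min_share)
```

In such a sketch, `evolve_weights` would be called with the predictions of the first-step classifiers on a validation sample, and `frequent_features` with the feature subsets those classifiers selected. Using accuracy as the fitness function is itself an assumption; another metric, such as a precision-recall balance, could be substituted.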