Applying Complementary Credit Scores to Calculate Aggregate Ranking
Researchers have been improving credit scoring models for decades. The reason is as follows: even a small increase in the predictive power of a scoring model can save a financial institution from significant losses. As a result, many researchers have concluded that ensembles of classifiers or aggregated scorings perform better. However, on unbalanced samples ensembles outperform the base classifiers by only thousandths of a percent.
This article proposes an aggregated scoring. Unlike previously proposed aggregate scores, its base classifiers focus on identifying different types of borrowers. The purpose of this study is to illustrate the effectiveness of such scoring aggregation on real unbalanced data.
We use a single performance measure as the indicator of effectiveness: the area under the ROC curve. The DeLong, DeLong and Clarke-Pearson test is used to assess whether the difference between two or more areas is statistically significant. In addition, this study uses a logistic model of defaults (logistic regression), which is applied to data from companies' financial statements. This model is usually focused on identifying default borrowers. To obtain a scoring aimed at non-default borrowers, a modified Kemeny median is used, which was devised by the authors to rank companies with credit ratings. Both scores are aggregated by logistic regression.
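The area under the ROC curve can be read as the share of (default, non-default) pairs in which the defaulter receives the higher score. The sketch below computes it that way in plain Python. The function name `roc_auc` and the toy scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 marks a default, 0 a non-default (an assumed encoding).
    A pair counts as 1 if the defaulter scores higher, 0.5 on a tie.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: two non-defaults followed by two defaults.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = roc_auc(toy_scores, toy_labels)
print(toy_auc)  # 0.75: three of the four pairs are ordered correctly
```

The pairwise form is quadratic in the sample size, so it suits small illustrations; a rank-based computation would be preferable on a full sample.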
Our data contain most of the observations on Russian banks that operated or defaulted from 01.07.2010 to 01.07.2015. The sample of banks is highly unbalanced: the share of defaults is about 5%. However, the aggregation is carried out only for banks that have several ratings. We find that aggregating classifiers based on different types of information significantly improves the discriminatory power of scoring even on an unbalanced sample.
This aggregated scoring and the approach to its construction could be applied in financial institutions as part of credit risk assessment, and also as an auxiliary tool in the decision-making process because of the relatively high interpretability of these scores.
In this study, a CHAID-based approach to detecting heterogeneity in classification accuracy across segments of observations is proposed. It helps to solve two important problems facing a model builder: (1) how to automatically detect segments in which the model significantly underperforms, and (2) how to use knowledge about accuracy heterogeneity across segments to partition observations and thereby achieve better predictive accuracy. The approach was applied to churn data from the UCI Repository of Machine Learning Databases. By splitting the data set into four parts based on the decision tree and building a separate logistic regression scoring model for each segment, we increased accuracy by more than 7 percentage points on the test sample. A significant increase in recall and precision was also observed. It was shown that different segments may have entirely different churn predictors. Such a partitioning therefore gives better insight into the factors influencing customer behavior.
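CHAID chooses splits with chi-square tests of independence. The sketch below applies that test to a single candidate segmentation, asking whether the rate of correct classifications differs across segments. It illustrates the general idea only and is not the authors' procedure; the function names and the toy data are assumptions.

```python
from collections import defaultdict


def correctness_by_segment(segments, y_true, y_pred):
    """Return {segment: [n_correct, n_incorrect]} for a fitted classifier."""
    counts = defaultdict(lambda: [0, 0])
    for seg, truth, pred in zip(segments, y_true, y_pred):
        counts[seg][0 if truth == pred else 1] += 1
    return dict(counts)


def chi_square_statistic(counts):
    """Pearson chi-square for the segment-by-correctness contingency table."""
    total_correct = sum(c for c, _ in counts.values())
    total_wrong = sum(w for _, w in counts.values())
    n = total_correct + total_wrong
    stat = 0.0
    for correct, wrong in counts.values():
        size = correct + wrong
        for observed, column_total in ((correct, total_correct),
                                      (wrong, total_wrong)):
            expected = size * column_total / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Toy data: the model is always right in segment "a", always wrong in "b".
toy_segments = ["a"] * 4 + ["b"] * 4
toy_truth = [1] * 8
toy_pred = [1, 1, 1, 1, 0, 0, 0, 0]
toy_counts = correctness_by_segment(toy_segments, toy_truth, toy_pred)
toy_stat = chi_square_statistic(toy_counts)
print(toy_counts, toy_stat)
```

With two segments the statistic has one degree of freedom, so a value above 3.841 indicates heterogeneity at the 5% level. On the toy data the statistic is 8.0, which flags segment "b" as a candidate for a separate model.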
A procedure for modeling student performance on the EGE (the Russian Unified State Exam) in mathematics is considered. For some tasks, the important predictors are identified. Binary logistic regression and ordinal regression models are built to predict the probability of solving a task.
Most existing scoring systems are based on binary choice models with sample selection. This setting does not allow up-to-date information about loans to be used, and many observations are lost. In this paper, a binary choice model with sample selection is extended to the case of many periods. The extension allows defaults to be modeled for each period, which solves the problem of lost observations. The same setting can also be used to estimate the effectiveness of a bank's existing scoring system. The model is estimated using data provided by a commercial bank in Nizhny Novgorod. The sample consists of observations from January 2009 to March 2012.
This book constitutes the refereed proceedings of the 6th IAPR TC3 International Workshop on Artificial Neural Networks in Pattern Recognition, ANNPR 2014, held in Montreal, QC, Canada, in October 2014. The 24 revised full papers presented were carefully reviewed and selected from 37 submissions for inclusion in this volume. They cover a wide range of topics in the field of learning algorithms and architectures and discuss the latest research, results, and ideas in these areas.
This paper considers the problem of choosing an optimal set (subset) of descriptive variables (regressors) from a fixed set of candidates. The Forward Selection and Backward Elimination methods add a candidate to, or remove one from, the current set of descriptive variables step by step. At each step, every variable is tested for inclusion or exclusion using a chosen model comparison criterion, the change that improves the model the most is applied, and the process is repeated until no change improves the model. The model selection criteria may be calculated directly or recursively. Algorithms for the recursive computation of the residual sum of squares (RSS) for the model selection criteria in the recursive least squares method are presented. This paper evaluates the computational costs of the recursive calculation of stepwise model selection criteria for all possible steps of selection.
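A minimal sketch of forward selection with an RSS criterion follows. It updates the RSS step by step by residualizing the remaining candidates against each selected variable (Gram-Schmidt), which is a stand-in for, not a reproduction of, the recursive least squares formulas the paper presents. It also assumes no intercept term; the function names, the tolerance, and the toy data are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward_selection(columns, y, max_vars, tol=1e-12):
    """Greedy forward selection by largest RSS reduction.

    columns: list of candidate regressors, each a list of observations.
    Returns (selected column indices, RSS after each step, starting at step 0).
    """
    residual = list(y)  # y minus its projection on the selected variables
    cands = {j: list(col) for j, col in enumerate(columns)}
    selected = []
    rss_path = [dot(residual, residual)]
    while cands and len(selected) < max_vars:
        best_j, best_drop = None, tol
        for j, x in cands.items():
            norm = dot(x, x)
            if norm > tol:
                # RSS reduction from adding the residualized candidate x.
                drop = dot(residual, x) ** 2 / norm
                if drop > best_drop:
                    best_j, best_drop = j, drop
        if best_j is None:  # no candidate improves the fit
            break
        q = cands.pop(best_j)
        qq = dot(q, q)
        coef = dot(residual, q) / qq
        residual = [r - coef * qi for r, qi in zip(residual, q)]
        for j in list(cands):  # residualize remaining candidates against q
            c = dot(cands[j], q) / qq
            cands[j] = [xi - c * qi for xi, qi in zip(cands[j], q)]
        selected.append(best_j)
        rss_path.append(rss_path[-1] - best_drop)
    return selected, rss_path


# Toy data: y is exactly twice the first candidate.
toy_columns = [[1, 2, 3, 4], [1, 0, -1, 0], [1, 1, 1, 1]]
toy_y = [2, 4, 6, 8]
toy_selected, toy_rss = forward_selection(toy_columns, toy_y, max_vars=2)
print(toy_selected, toy_rss)
```

Because each step reuses the residuals from the previous one, no model is refitted from scratch. On the toy data the first candidate removes all of the RSS, so the search stops after one variable.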
The paper examines the structure, governance, and balance sheets of state-controlled banks in Russia, which accounted for over 55 percent of the total assets in the country's banking system in early 2012. The author offers a credible estimate of the size of the country's state banking sector by including banks that are indirectly owned by public organizations. Contrary to some predictions based on the theoretical literature on economic transition, he explains the relatively high profitability and efficiency of Russian state-controlled banks by pointing to their competitive position in such functions as acquisition and disposal of assets on behalf of the government. Also suggested in the paper is a different way of looking at market concentration in Russia (by consolidating the market shares of core state-controlled banks), which produces a picture of a more concentrated market than officially reported. Lastly, one of the author's interesting conclusions is that China provides a better benchmark than the formerly centrally planned economies of Central and Eastern Europe by which to assess the viability of state ownership of banks in Russia and to evaluate the country's banking sector.
The paper examines the principles for the supervision of financial conglomerates proposed by the BCBS in the consultative document published in December 2011. In addition, the article presents a number of suggestions developed by the authors as part of the HSE research team.