Query-Based Versus Tree-Based Classification: Application to Banking Data
The cornerstone of retail banking risk management is the estimation of expected losses when granting a loan to a borrower. The key driver of loss estimation is the borrower's probability of default (PD). Assessing PD is a classification problem. In this paper we apply FCA query-based classification techniques to open credit scoring data from Kaggle. We argue that query-based classification achieves higher classification accuracy than classical banking models while retaining the interpretability of model results, whereas black-box methods offer better accuracy at the cost of interpretability.