Query-Based Versus Tree-Based Classification: Application to Banking Data

A. Masyutin; Y. Kashnitsky

doi:10.1007/978-3-319-60438-1_65


P. 664–673.


The cornerstone of retail banking risk management is estimating the expected loss from granting a loan to a borrower. The key driver of loss estimation is the borrower's probability of default (PD), and assessing PD is a classification problem. In this paper we apply FCA query-based classification techniques to open credit scoring data from Kaggle. We argue that query-based classification achieves higher classification accuracy than classical banking models while retaining the interpretability of model results, whereas black-box methods offer better accuracy but diminish interpretability.

Keywords: Formal Concept Analysis; probability of default; credit scoring; classification; Kaggle

Publication based on the results of:

Explanation-oriented Methods of Data Analysis for Semantically Rich Data and Their Applications (2017)

In book

Foundations of Intelligent Systems

Kryszkiewicz M., Slezak D., Skowron A., Rybinski H., Appice A., Raś Z. Warsz.: Springer, 2017.

Terra incognita: the Oïl languages (langues d’oïl)

Бестолкова Г. В., Вестник Донецкого национального университета. Филология и психология 2025 No. 6 P. 62–73

The paper aims to formulate the term “langues d’oïl” and integrate it into the conceptual framework of contemporary Russian Romance studies, in order to expand scholarly knowledge of the Romance languages. To this end, the paper pursues several objectives: analysing the historical background of the formation of the term “langues d’oïl”; describing the areas of regional languages in modern France; comprehensive description of “French ...

Added: February 23, 2026

Territorial variation of the Occitan language: a classification of the northern dialects

Бестолкова Г. В., Теория языка и межкультурная коммуникация 2023 No. 3(50) P. 1–15

The variety of dialects, subdialects and colloquial speech plays a significant role in the development of the modern Occitan language, which makes the study undertaken in this article relevant. Occitan has a large number of dialects, so only its northern dialects are considered in detail in this article. The material contained in the article allows one to form a ...

Added: February 15, 2026

Is Canfield Right? On the Asymptotic Coefficients for the Maximum Antichain of Partitions and Related Counting Inequalities

Ignatov D. I., in: 11th International Conference, AIST 2023, Yerevan, Armenia, September 28–30, 2023, Revised Selected Papers. Analysis of Images, Social Networks and Texts. Lecture Notes in Computer Science (LNCS, volume 14486).: Cham: Springer, 2024. P. 349–361.

This paper dates back to the asymptotic solutions of Rota’s problem on the size of maximum antichain in the set partition lattice by Canfield and Harper and others. The knowledge of asymptotic coefficients could pave the way to the asymptotic solutions of such problems as (maximal) antichain counting in partition lattices. In addition to our ...

Added: January 23, 2026

Classification Approach to Mapping Cultural Differences: An Illustration Using Survey Data from 60 Russian Regions

Nastina E., Sokolov B. / Series OSF "SocArXiv". 2025.

We argue that a classification-based approach to measuring cultural differences across countries or subnational regions is a promising complement, and sometimes an alternative, to the widely used dimensional method in cross-cultural research. The latter summarises cultural variation using continuous dimensions, for example, Hofstede’s famous individualism-collectivism dimension. However, this approach relies on strong parametric assumptions, which are ...

Added: December 23, 2025

Approaches to estimating default rates for the rating scales of credit rating agencies

Ozerov K., Кутенко С. В., Деньги и кредит 2024 Vol. 83 No. 4 P. 98–118

Under limited data, the classical cohort method for the creation of migration matrices does not fully reflect the dynamics of the credit quality of the objects within the sample. This problem is exacerbated for objects of lower credit quality less represented in the sample. This paper investigates a continuous time approach to the creation of ...

Added: December 20, 2025

The Impact of Alternative Data on Default Probability: Analyzing the Italian E-commerce Sector with NLP and Network Structures

Bernhardt B. D., Marciano C., Guarracino M. R., Operations Research Forum 2025 Vol. 6 Article 47

E-commerce is a key sector in the Italian economy, with online companies becoming some of the largest and most profitable businesses. However, this growth comes with increased risk exposure. This study aims to investigate the relationship between alternative data (contextual factors, Text-Driven Data Enrichment) and the probability of default for Italian e-commerce companies. To date, ...

Added: September 6, 2025

Abstract logics as structures and classifications of structures

Dragalina-Chernaya E., in: Fourteenth Smirnov Readings in Logic: Proceedings of the International Scientific Conference, Moscow, June 19–21, 2025.: Moscow: Издатель Александр Воробьев, 2025. P. 80–82.

The talk compares interpretations of abstract logics as structures and as classifications of abstract structures. ...

Added: June 20, 2025

Counterfactual explanations based on synthetic data generation

Yuri A. Zelenkov, Elizaveta V. Lashkevich, Business Informatics 2024 Vol. 18 No. 3 P. 24–40

A counterfactual explanation is the generation for a particular sample of a set of instances that belong to the opposite class but are as close as possible in the feature space to the factual being explained. Existing algorithms that solve this problem are usually based on complicated models that require a large amount of training data and significant ...

Added: October 13, 2024

Parametric methods for precision calibration of scoring models

Mikhail Pomazanov, Berezhnoy A., in: Procedia Computer Science, Volume 242: 11th International Conference on Information Technology and Quantitative Management (ITQM 2024).: ScienceDirect, 2024. P. 348–355.

Once a scoring model has been developed for use in assessing a borrower’s credit risk, under the internal ratings-based (IRB) approach, it must be calibrated to a real-world measure of default frequency. The conservativeness of the calibration is tightly controlled, if it is not violated in the allowed number of digits of the rating scale, ...

Added: September 3, 2024

Model-theoretic logics as classifications of definite manifolds

Dragalina-Chernaya E., Логико-философские штудии 2024 Vol. 21 No. 4 P. 163–164

The article proposes an interpretation of model-theoretic (abstract) logics as classifications of definite manifolds, which constitute the subject matter of logic as formal ontology in Husserl's interpretation. ...

Added: June 18, 2024

Determining the minimum sample size for provision extrapolation in the presence of default correlation

Penikas H. I. / Bank of Russia. Bank of Russia report series "Economic Research Report Series". 2024.

In 2016, the Bank of Russia developed two ordinances setting forth a procedure that uses a limited sample of loans to conclude whether the level of loss provisions in a portfolio of uniform loans is sufficient and whether the bank’s capital is adequate. The existing procedure for evaluating reserve sufficiency as a rule involves considering only part of the loan portfolio and transferring (extrapolating) the provision thus assessed to the overall portfolio. Moreover, the current approach to defining the minimum loan sample size assumes the absence of default correlation. Author’s contribution ...

Added: June 1, 2024

Unresolved problems in applying standardization documents in procurement

Байрашев В. Р., Стандарты и качество 2024 No. 5(1043) P. 25–29

The article reviews topical issues of using standardization documents in public and corporate procurement, including the interrelation of national standards and other legislative requirements for the description of goods, works and services, and the application of the nomenclature and terminology of standardization legislation in procurement. The analysis of the procedure for using standardization documents led to a conclusion about the necessity ...

Added: May 6, 2024

Russian glosses in the German Mattioli

Lifshits A., Святохина Е. В., in: Auxiliary Historical Disciplines in Contemporary Scholarship: Proceedings of the XXXVI All-Russian Scientific Conference with International Participation. Moscow, April 4–5, 2024.: Moscow: ИВИ РАН, 2024. P. 220–221.

Abstract of a talk ...

Added: April 4, 2024

Classification of brain activity using Synolitic networks

D. V. Vlasenko, A. A. Zaikin, D. G. Zakharov, Izvestiya Vysshikh uchebnykh zavedeniy. Prikladnaya nelineynaya dinamika 2023 Vol. 31 No. 5 P. 661–669

Because the brain is an extremely complex hypernet of interacting macroscopic subnetworks, full-scale analysis of brain activity is a daunting task. Nevertheless, this task can be greatly simplified by analysing the correspondence between various patterns of macroscopic brain activity, for example, through functional magnetic resonance imaging (fMRI) scans, and the performance of particular cognitive tasks ...

Added: March 11, 2024

Symplectic Partially Hyperbolic Automorphisms of 6-Torus

L. M. Lerman, K. N. Trifonov, Journal of Geometry and Physics 2024 Vol. 195 No. 2 Article 105038

We study topological properties of automorphisms of a 6-dimensional torus $\mathbb{T}^6$ generated by integer matrices with simple eigenvalues being symplectic with respect to either the standard symplectic structure in $\mathbb{R}^6$ or a nonstandard symplectic structure given by an integer skew-symmetric non-degenerate matrix. Such a symplectic matrix generates a partially hyperbolic automorphism of the torus, if its eigenvalues lie both ...

Added: October 31, 2023

Classification Using Marginalized Maximum Likelihood Estimation and Black-Box Variational Inference

Shalileh S., in: Data Analysis and Optimization. In Honor of Boris Mirkin's 80th Birthday.: Springer, 2023. P. 349–361.

Based upon variational inference (VI) a new set of classification algorithms has recently emerged. This set of algorithms aims (A) to increase generalization power, (B) to decrease computational complexity. However, the complex math and implementation considerations have led to the emergence of black-box variational inference methods (BBVI). Relying on these principles, we assume the existence ...

Added: October 17, 2023

Classification of brain activity using Synolitic networks

Vlasenko D., Zaikin A., Zakharov D., Известия высших учебных заведений. Прикладная нелинейная динамика 2023 Vol. 31 No. 5 P. 661–669

Because the brain is an extremely complex hypernet of interacting macroscopic subnetworks, full-scale analysis of brain activity is a daunting task. Nevertheless, this task can be greatly simplified by analysing the correspondence between various patterns of macroscopic brain activity, for example, through functional magnetic resonance imaging (fMRI) scans, and the performance of particular cognitive tasks or pathological states. The purpose of ...

Added: October 4, 2023

Abstract logics as formal ontologies as classifications

Dragalina-Chernaya E., in: 4th Lisbon International Conference on Philosophy of Science - LICPOS 2023.: Lisbon: CFCUL, 2023. P. 27–28.

This paper offers an interpretation of abstract logics as formal ontologies as well as higher-level classifications. ...

Added: July 14, 2023

The problem of interpretation, differentiation and classification of digital products

Shaidullin A., Бизнес-информатика 2023 Vol. 17 No. 2 P. 55–70

Digital innovative products often become a significant factor in the revision of companies’ business strategies and influence consumer preferences. A key component in the process of formulating such strategies is understanding the implications underlying the attributes of digital products. This requires a good understanding of their nature and characteristics. To date, there is no solid ...

Added: June 29, 2023

The origin and classification of political ideologies: an interdisciplinary approach

Казаков И. В., Политическая наука 2023 No. 1 P. 322–337

Particular cases of disputes between actors on political issues can be traced to the differences in the ideologies that they follow. In this regard, when analyzing political discourse, it would be useful to attribute individual positions expressed by actors to their ideologies. However, in practice this proves to be difficult due to the shortcomings of ...

Added: December 10, 2022

Risk Management Tools to Improve the Efficiency of Lending to Retail Segments

Pomazanov M. V., in: Risk Management, Sustainability and Leadership.: L.: IntechOpen, 2022. Ch. 6.

This chapter discusses the issue of assessing the quality of risk management for a wide segment of retail lending (from consumer loans to loans for self-employed persons and SMEs). The quality of risk management is assessed using the generally recognized approach of the ROC analysis methodology and assessment of the optimal level of discrimination, taking ...

Added: November 16, 2022