Comparative Analysis of Predictive Analytics Models in Classification Problems

L. Zhukova; K. Polyakov

doi:10.1109/APSSE47353.2019.00028

P. 162–169.

Zhukova L., Polyakov K.

This research compares the classification quality of several descriptive and predictive analytics methods in the case where most (or all) of the independent variables are measured on a qualitative scale with a large number of levels. In this case, some classification methods, or their popular implementations, require that qualitative variables be converted into systems of dummy variables. If the qualitative scales have many levels that appear in almost equal proportions in the training set, so that merging levels makes no sense, this requirement leads to a dramatic rise in the dimension of the problem. As a result, the researcher faces the curse of dimensionality: as the problem dimension rises, the sample size must rise as well to preserve the accuracy with which the factors' impact is estimated. At the same time, it is not always possible to arrange a corresponding growth of the training set; in some cases its size is limited by specific properties of the object (system) under study. In such a situation, it becomes extremely important to evaluate how sensitive prediction and classification methods are to the curse of dimensionality.

The authors focus on four classification methods that have long held top positions in lists of popular business analysis methods:
• Two methods of classification tree building: CART and C4.5
• Logistic regression
• Classification based on random forest

The first three are descriptive methods, which yield interpretable (human-readable) models; the fourth belongs to predictive analytics. The selection is not random. Descriptive analytics problems are extremely important in planning, when one needs an answer to the question "What will happen if …?". In particular, one needs a description of the target group in order to organize marketing communication. At the same time, it is quite conceivable that using interpretable models involves a loss of prediction quality compared with predictive analytics methods.

The research domain is the activity of microfinance institutions (MFIs). The traditional problem here is the assessment of a potential client. The main challenge in solving this problem is the constraints on the volume, composition, and type of data available for predicting default or assessing default probability. Thus, it is necessary to evaluate the abilities of classification methods that were designed to work with large amounts of data (that is, a large training set and many variables, from which the most important should be selected). In the real practice of a microfinance organization, most recorded factors are measured on qualitative scales with a large number of levels, which is the origin of the problems described above. The empirical part of the research is based on the data of a real microfinance organization. Some hypotheses about the reasons for default were tested as a by-product of this research.
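The growth in dimension described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or data: the variable names (`region`, `occupation`, `loan_purpose`), their level counts, and the sample size are hypothetical, and the records are synthetic. The sketch converts each qualitative variable into dummy variables, dropping one reference level per variable, and reports how many training rows remain per resulting column.

```python
def dummy_columns(rows, variables):
    """List (variable, level) dummy columns, dropping one reference level per variable."""
    columns = []
    for var in variables:
        levels = sorted({row[var] for row in rows})
        # The first level (in sorted order) serves as the reference category.
        columns.extend((var, level) for level in levels[1:])
    return columns


def encode(row, columns):
    """Encode one record as a 0/1 vector over the dummy columns."""
    return [1 if row[var] == level else 0 for var, level in columns]


# Hypothetical variables and level counts, chosen only for illustration.
LEVEL_COUNTS = {"region": 50, "occupation": 30, "loan_purpose": 20}
N_ROWS = 500

# Synthetic records in which every level appears in nearly equal proportion.
rows = [
    {var: f"{var}_{i % k:02d}" for var, k in LEVEL_COUNTS.items()}
    for i in range(N_ROWS)
]

columns = dummy_columns(rows, list(LEVEL_COUNTS))
rows_per_column = N_ROWS / len(columns)

print(f"original variables: {len(LEVEL_COUNTS)}")
print(f"dummy columns:      {len(columns)}")
print(f"rows per column:    {rows_per_column:.1f}")
```

Under these assumed numbers, three qualitative variables become 97 dummy columns, leaving roughly five training rows per column. This is the situation in which a method that requires dummy coding may be most exposed to the curse of dimensionality.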

Keywords: predictive analytics; descriptive analytics; microfinance organizations; social media analysis

In book

Actual Problems of Systems and Software Engineering APSSE 2019 (Invited Papers)

Los Alamitos, Washington, Tokyo: IEEE Computer Society, 2019.

Caspian Pipeline Consortium Equipment Monitoring System

Belov A. V., Irina V. Arutyunova, in: Proceedings of the 2025 INTERNATIONAL CONFERENCE "QUALITY MANAGEMENT, DIGITAL SECURITY, INFORMATION TECHNOLOGIES" (2025 QM&DS&IT).: IEEE, 2025. P. 169–173.

The equipment condition assessment system of the Caspian Pipeline Consortium is intended to make monitoring more efficient and to manage the operation of pipeline systems. The primary objective of this initiative is to develop an information system that enables continuous monitoring of pipeline conditions, thereby minimizing maintenance costs and reducing risks associated with potential ...

Added: April 21, 2026

Enhancing the Effectiveness of Management Decisions in Public Transport through Multi-Agent Technologies

Trofimov S., Ymer 2025 Vol. 24 No. 11 P. 1–18

This paper presents the development of a multi-agent system for managing the technical condition of bus fleets in public transport enterprises. The proposed solution aims to enhance management decision-making processes in vehicle maintenance through the implementation of intelligent agents and data analytics. The system is based on the interaction of two types of agents: "Bus," which collects ...

Added: January 24, 2026

Optimization of business processes using artificial intelligence-based automated control systems

D. Pshychenko, Вестник Воронежского института высоких технологий 2024 Vol. 18 No. 3 Article 14

The article discusses the optimization of business processes using automated control systems based on artificial intelligence (AI). Modern methods such as machine learning, robotic process automation, natural language processing, and predictive analytics are examined, which enhance operational efficiency, improve forecasting accuracy, and minimize errors. The advantages of implementing AI in various business sectors are analyzed, including the automation of ...

Added: March 10, 2025

Evaluation of the effectiveness of implementing AI-based CRM systems

Pshichenko D., Инновационная наука 2024 No. 7-2 P. 40–45

This paper evaluates the effectiveness of AI-based Customer Relationship Management (CRM) systems compared to traditional CRM systems. It examines the impact of AI integration on business operations, focusing on automation and personalization. A comparative analysis highlights significant advantages of AI-CRM systems in terms of customer satisfaction and operational efficiency. The study also addresses challenges, including high implementation costs and ...

Added: March 10, 2025

Pragmatic Analytics

Kuzminov I., Алешин В. А., Ананян С. М. et al., РПП ИНЭС/ МНИИПУ, 2023.

The monograph is based on materials from conferences held by the Association "Analytics" in 2017-2022, as well as from the First Eurasian Analytical Forum (2019) and the Second Eurasian Analytical Forum (2021). The materials are practice-oriented and cover the conceptual foundations of pragmatic analytics and successful practices of the types of analytical activity used to resolve problem situations in various fields: descriptive analytics; diagnostic analytics; predictive ...

Added: November 13, 2024

Anger, identity, or belief in success? The dynamics of motivation and participation in the 2020 Belarusian protests

Akhremenko A. S., Petrov A., Полис. Политические исследования 2023 No. 2 P. 138–153

In this paper, we present our estimates of the motivational and participative dynamics in the protest against the announced results of the presidential elections in the Republic of Belarus in 2020. The campaign is analyzed throughout the whole period of its active development: from August to December. Based on contemporary achievements in the social psychology of protest movements, ...

Added: April 19, 2023

Predicting customer churn based on changes in their behavior patterns

Yury A. Zelenkov, Angelina S. Suchkova, Business Informatics 2023 Vol. 17 No. 1 P. 7–17

Customer retention is one of the most important tasks of a business, and it is extremely important to allocate retention resources according to the potential profitability of the customer. Most often the problem of predicting customer churn is solved based on the RFM (Recency, Frequency, Monetary) model. This paper proposes a way to extend the ...

Added: April 2, 2023

Social networks as a source of information about Russians' attitudes toward corruption

Maksimenko A., Krylova D., Общество: социология, психология, педагогика 2023 No. 1 P. 69–79

The authors attempt to study the attitudes of Russians towards corruption by analyzing information obtained from the social media platform “VKontakte”. Based on a review of English-language sources and publications by Russian scholars, a number of assumptions are made about the relationship between personal and socio-demographic characteristics and intolerance for corruption offences. The analysis of ...

Added: January 26, 2023

The use of predictive modeling for assessing college readiness

Bryer J., Akhmedjanova D., Andrade H. et al., in: Enhancing Effective Instruction and Learning Using Assessment Data.: Information Age Publishing Inc., 2022. Ch. 5 P. 83–108.

Chapter 5 introduces software for automating some aspects of developmental education and the use of predictive modeling. ...

Added: October 31, 2022

An Experimental Study of the Operation of Predictive Models as Part of IoT Platforms

Gorshkov O., Kychkin A., in: Межвузовская научно-техническая конференция студентов, аспирантов и молодых специалистов имени Е.В. Арменского. Материалы конференции.: Moscow: МИЭМ НИУ ВШЭ, 2021. P. 272–274.

Added: August 11, 2022

Application of Predictive Analytics to Sales Planning Business Process of FMCG Company

Panvlyuchenko, K., Panfilov, P., in: MEDES '21: Proceedings of the 13th International Conference on Management of Digital EcoSystems.: NY: Association for Computing Machinery (ACM), 2021. P. 167–170.

Many organizations in the FMCG industry try to improve their business processes and make them more effective by implementing different technologies, one of which is predictive analytics. This technology is used in areas such as planning and forecasting product sales. Because market trends are very unstable, a company needs to know how much to produce. This work focuses on ...

Added: January 15, 2022

Predictive analytics of digital conflicts: a neural network approach

Kharlamov A. A., Pilgun M., in: XII Международная научно-техническая конференция "Нейроинформатика-2020". Сборник научных трудов". М.: МИФИ, 2020.: М.: [б.и.], 2020.

Added: January 8, 2022

Data analytics for management decision-making in education

Zair-Bek S. I., Mertsalova T., in: Большие данные в образовании: доказательное развитие образования. Сборник научных статей II Международной конференции, 15 октября 2021 года, Москва.: Издательский дом «Дело» РАНХиГС, 2021. Ch. 4 P. 186–210.

The article presents modern approaches to using evidence-based analytics when working with data to make management decisions at different levels. It draws on research conducted by the authors over the past five years. The data presented, and the technologies for analytical work with them, demonstrate significant dynamics in the conditions of management decision-making that relies on the results of analytical procedures. At the same time, ...

Added: November 22, 2021

Personal bankruptcy in Russia: growth in the number of court decisions accelerated in June

Юхнин А. В., Имущественные отношения в Российской Федерации 2020 No. 8 P. 81–85

The article examines the reasons for the increase in the number of insolvent consumers in Russia. It concludes that the trend is driven not so much by economic factors as by demand for the procedure as a lawful way to be released from debts, and by the response of market professionals and insolvency practitioners to the development of technology. In the author's opinion, regional dynamics are influenced by the level of financial literacy ...

Added: November 17, 2021

The microfinance market in Russia: trends and barriers to development

Балихина Н. В., Kosov M., Финансовая жизнь 2019 No. 2 P. 84–87

Microfinance institutions play an essential role in the Russian economy. The purpose of the work is to determine the current and expected state of the microfinance market in Russia. The results reveal positive growth dynamics in the loan portfolio of microfinance institutions alongside a simultaneous reduction in the number of such institutions. The problem of ...

Added: September 2, 2021

Examining private sector strategies for preventing insurance fraud

Timofeyev Y., Skidmore M., in: The Handbook of Security, 3rd ed.: Palgrave Macmillan, 2022. Ch. 12 P. 239–260.

Added: April 24, 2021

Implementation of a descriptive analytics service for the IoT platform of an industrial enterprise

Markvirer V., Щелкунов А. А., Deryabin A. I., in: Материалы конференции. Межвузовская научно-техническая конференция студентов, аспирантов и молодых специалистов им. Е.В. Арменского.: Moscow: МИЭМ НИУ ВШЭ, 2020. P. 202–204.

With the development of the Industry 4.0 concept, the digital transformation of enterprises and the use of new management methods based on Internet of Things (IoT) technologies and big data analysis take on paramount importance. The use of IoT platforms has multiplied the volume of information generated. According to forecasts, by 2025 the volume of all data worldwide will reach 175 zettabytes (ZB), compared with ...

Added: December 29, 2020

About some issues of developing Digital Twins for the intelligent process control in quarries

Deryabin S., Temkin I., Zykov S. V., in: Knowledge-Based and Intelligent Information & Engineering Systems: Proceedings of the 24th International Conference KES2020. Vol. 176.: Elsevier, 2020. P. 3210–3216.

The present work is devoted to the problems of Digital Twin development of industrial enterprises in the field of mining. The main goal of this article is to formulate the principles of designing platform solutions for the integration of the most important functional elements that ensure the implementation of technological processes of a full production ...

Added: November 21, 2020

Application of the principal component analysis to detect semantic differences during the content analysis of social networks

Rytsarev I. A., Kozlov D., Kravtsova N. S. et al., in: Data Science. Information Technology and Nanotechnology 2018. Issue 2212.: CEUR Workshop Proceedings, 2018. P. 262–269.

In this paper, we propose an approach to semantic differences detection in texts presented in the form of frequency dictionaries. The original text data has been obtained by collecting records on various online communities. We have implemented a specialized software module that allows us to analyze and download both posts and comments from the social ...

Added: October 9, 2020

Method of Digital Counterpart Creation of Physical Processes at Productive Foresight Modeling of Cyber-Physical Systems

Kofanov Y. N., Sotnikova S., in: 2020 Moscow Workshop on Electronic and Networking Technologies (MWENT).: IEEE, 2020. P. 1–5.

The authors propose using digital counterparts of physical processes in the electronic equipment of cyber-physical systems when designing them for a long active lifetime, for example, spacecraft in orbit or on a long flight. At the same time, it is proposed to create a database of big data ...

Added: May 9, 2020

Experience in modeling the probability of credit default for clients of microfinance organizations (the case of one MFO)

Zhukova L., Polyakov K., Экономический журнал Высшей школы экономики 2019 Vol. 4 No. 23 P. 497–523

Microfinance organizations became widespread during the crisis years, issuing microloans (up to 100,000 rubles) at high interest rates with practically no documents. Today the Central Bank of the Russian Federation actively regulates this market, steadily tightening requirements and limiting rates and penalties on loans issued. This makes it necessary to form a new strategy for assessing the risk of non-repayment of an issued loan or credit, built on preventing late payments by clients. For ...

Added: December 15, 2019