Explainable Machine Learning for Sequences of Demographic Statuses
P. 358–367.
The article presents a case study on the analysis of demographic sequences using modern machine learning (ML)
techniques. The studied data contain demographic and socioeconomic events, represented
as sequences of statuses. The demographers involved are interested in applying advanced ML techniques
and in obtaining interpretable patterns. We show how Shapley value-based explanations can be obtained for
such sequential data with a powerful ML approach, namely gradient boosting over decision trees. These explanations help
to identify the critical and influential events in a particular individual's life-course sequence and to explain
predictions.
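As a rough illustration of what a Shapley value-based explanation of a single prediction looks like, the sketch below computes exact Shapley values for one individual by enumerating feature coalitions. It is not the authors' pipeline. The function `toy_model`, the three event indicators, and the all-zeros baseline are hypothetical stand-ins for a trained gradient boosting model and real life-course features. Tree ensembles are typically explained with a tree-specific algorithm rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value.
    The cost is exponential in the number of features, so this is
    only suitable for tiny illustrative examples.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(z):
    """Hypothetical score over binary event indicators (an assumption,
    standing in for a trained gradient boosting model)."""
    first_job, marriage, first_child = z
    return 0.1 + 0.2 * first_job + 0.3 * marriage + 0.4 * marriage * first_child


x = [1, 1, 1]          # events observed in one individual's sequence
baseline = [0, 0, 0]   # reference point: no events observed
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["first_job", "marriage", "first_child"], phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for the individual and the prediction for the baseline, which is the property that lets each event be read as its share of the prediction.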
In book
Vol. 212. , Elsevier, 2022.
Glushko A., Neznanov A., in: Advanced Materials and Technologies (PMT-2025): Proceedings of the National Scientific and Technical Conference with International Participation, Moscow, April 7–12, 2025.: Moscow: RTU MIREA, 2025. P. 651–657.
Most scientific research involves computational tools, yet researchers often have limited IT skills. The efficiency of working with computational modules for scientific research can be significantly increased by guaranteeing the reproducibility, interactivity, and reusability of the artifacts produced. This can be achieved by using a reactive environment with a standardized set of UI controls, ...
Added: April 29, 2026
Zelenkov Y., , in: 2023 Ivannikov ISPRAS Open Conference (ISPRAS).: IEEE, 2023. P. 176–182.
We propose a regression ensemble based on a decomposition that separates the weighted average errors of individual learners and the ambiguity of their estimates. This approach is a modification of Gradient Boosting with a variation of the gradient at each step, which makes it possible to ensure the diversity of base estimators explicitly. In addition, the proposed approach ...
Added: May 1, 2024
Bukina T. V., Kashin D., HSE Economic Journal 2024 Vol. 28 No. 1 P. 81–107
The paper presents forecasts of regional inflation for the regions of the Privolzhskiy Federal District (PFD). The purpose of the study is to determine the model that most accurately predicts regional inflation. The paper compares the tools of machine learning – support vector machines, gradient boosting, and random forest – with econometric models ...
Added: February 13, 2024
Ignatov D. I., Kwuida L., Annals of Mathematics and Artificial Intelligence 2022 Vol. 90 No. 11 P. 1197–1222
We propose the usage of two power indices from cooperative game theory and public choice theory for ranking attributes of closed sets, namely intents of formal concepts (or closed itemsets). The introduced indices are related to extensional concept stability and are also based on counting of generators, especially of those that contain a selected attribute. ...
Added: January 31, 2023
Suvorova A., , in: Digital Transformation and Global Society. 6th International Conference, DTGS 2021, St. Petersburg, Russia, June 23–25, 2021, Revised Selected Papers.: Springer, 2022. P. 319–331.
The increasing use of intelligent technologies, the development and implementation of machine learning systems in various spheres of life require explaining machine learning-based decisions in such systems. This need for interpretation leads to the increasing development of new methods for interpreting machine learning models and their more intense use in real systems. The paper reviews ...
Added: September 28, 2022
Ustimenko A., Prokhorenkova L., , in: Proceedings of the 38th International Conference on Machine Learning (ICML 2021). Vol. 139.: PMLR, 2021. P. 1–10.
This paper introduces Stochastic Gradient Langevin Boosting (SGLB) - a powerful and efficient machine learning framework that may deal with a wide range of loss functions and has provable generalization guarantees. The method is based on a special form of the Langevin diffusion equation specifically designed for gradient boosting. This allows us to theoretically guarantee ...
Added: August 6, 2021
Malinin A. A., Prokhorenkova L., Ustimenko A., , in: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021).: ICLR, 2021.
Added: August 2, 2021
Ivanov S., Prokhorenkova L., , in: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021).: ICLR, 2021.
Added: August 2, 2021
Liudmila Prokhorenkova, Ustimenko A., , in: International Conference on Machine Learning (ICML 2020). Vol. 119.: PMLR, 2020. P. 9669–9679.
In this paper, we introduce a powerful and efficient framework for direct optimization of ranking metrics. The problem is ill-posed due to the discrete structure of the loss, and to deal with that, we introduce two important techniques: stochastic smoothing and novel gradient estimate based on partial integration. We show that classic smoothing approaches may ...
Added: January 14, 2021
Antipov E. A., Pokryshevskaya E. B., Journal of Revenue and Pricing Management 2020 No. 19 P. 355–364
Forecasting demand and understanding sales drivers are among the most important tasks in retail analytics. Traditionally, however, sales modeling has relied predominantly on linear models and/or models with a small number of predictors. Taking into account that real-world demand is naturally determined by complex substitution and complementation patterns among a large number of ...
Added: October 31, 2020
Ignatov D. I., Kwuida L., , in: Ontologies and Concepts in Mind and Machine. 25th International Conference on Conceptual Structures, ICCS 2020.: Springer, 2020. P. 90–102.
Among the family of rule-based classification models, there are classifiers based on conjunctions of binary attributes. For example, the JSM-method of automatic reasoning (named after John Stuart Mill) was formulated as a classification technique in terms of intents of formal concepts as classification hypotheses. These JSM-hypotheses already represent an interpretable model since the respective conjunctions ...
Added: October 30, 2020
Ignatov D. I., Kwuida L., , in: Proceedings of the Fifteenth International Conference on Concept Lattices and Their Applications. Vol. 2668.: CEUR-WS.org, 2020. P. 259–271.
We propose the usage of two power indices from cooperative game theory and public choice theory for ranking attributes of closed sets, namely intents of formal concepts (or closed itemsets). The introduced indices are related to extensional concept stability and based on counting generators, especially those that contain a selected attribute. The introduction of such ...
Added: October 30, 2020
Liudmila Prokhorenkova, Gusev G., Vorobev A. et al., , in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018).: Neural Information Processing Systems Foundation, 2018. P. 6638–6648.
This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of datasets. Two critical algorithmic advances introduced in CatBoost are the implementation of ordered boosting, a permutation-driven alternative to the classic algorithm, and ...
Added: May 1, 2020
Timur Kadyrov, Ignatov D. I., , in: Proceedings of the Fifth International Workshop on Experimental Economics and Machine Learning (EEML 2019), Perm, Russia, September 26, 2019. Vol. 2479.: CEUR Workshop Proceedings, 2019. P. 77–88.
A multichannel attribution model based on gradient boosting over trees is proposed and compared with state-of-the-art models: bagged logistic regression, the Markov chain approach, and the Shapley value. Experiments on digital advertising datasets showed that the proposed model outperforms the considered solutions in terms of the ROC AUC metric. In addition, the ...
Added: January 20, 2020
Kuralenok I., Ershov V., Labutin I. N., , in: Advances in Neural Information Processing Systems 32 (NeurIPS 2019).: [s.n.], 2019. P. 1–10.
In this work, we introduce a new decision tree ensemble representation framework: instead of using a graph model we transform each tree into a well-known polynomial form. We apply the new representation to three tasks: theoretical analysis, model reduction, and interpretation. The polynomial form of a tree ensemble allows a straightforward interpretation of the original ...
Added: December 27, 2019
Bulychev A., Somov O. D., in: Informatics, Management and Systems Analysis: Proceedings of the V All-Russian Scientific Conference of Young Scientists with International Participation.: Rostov-on-Don: Rostov State University of Economics (RINH), 2018. P. 94–102.
When developing an information system for logistics transportation, it is necessary to determine the initial rating of a new carrier within the parent company. Having this rating helps to form orders more accurately and to forecast its interaction with the parent company in the ...
Added: September 3, 2019
Kulagin M., Sidorenko V., , in: Proceedings of the Third International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’18). Vol. 2.: Springer, 2019. P. 308–316.
This article describes modern data processing methods applied to the task of assessing the activities of transportation employees. The main purpose was to find dependencies in the data and construct an algorithm for predicting the probability of a transport safety violation by an employee. The research was conducted for locomotive drivers. The following algorithms were used: neural networks, gradient boosting over decision trees and ...
Added: January 19, 2019
Gizdatullin D., Ignatov D. I., Mitrofanova E. et al., , in: 14th International Conference on Formal Concept Analysis - Supplementary Proceedings.: University Rennes 1, 2017. P. 49–66.
This paper presents recent results on applying sequence-based pattern structures and emerging patterns to the analysis of demographic sequences in Russia. The study uses data on 11 generations from 1930 to 1984 from the panel of three waves of the Russian part of the Generations and Gender Survey, which took place in ...
Added: June 20, 2017