MonoForest framework for tree ensemble analysis

P. 1–10.
Kuralenok I., Ershov V., Labutin I. N.

In this work, we introduce a new framework for representing decision tree ensembles: instead of using a graph model, we transform each tree into a well-known polynomial form. We apply the new representation to three tasks: theoretical analysis, model reduction, and interpretation. The polynomial form of a tree ensemble allows a straightforward interpretation of the original model, and in our experiments it shows results comparable with state-of-the-art interpretation techniques. Another application of the framework is ensemble-wise pruning: we can drop monomials from the polynomial based on training data statistics, which reduces the model size by up to a factor of 3 without loss of quality. It is also possible to show the equivalence of tree shape classes that share the same polynomial. This lets us train a model with one tree shape and deploy it with another that is easier to compute or interpret. We formulate a problem statement for optimal translation of a tree ensemble from one form to another and build a greedy solution to this problem.
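The abstract does not spell out the polynomial form, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each split is a threshold test `x[feature] > threshold`, treats each such test as a 0/1 indicator, and rewrites a tree as a sum of weighted monomials (products of indicators). The tree encoding, the function names, and the support-based pruning rule are all hypothetical and chosen for illustration; the statistic the paper actually uses for pruning is not stated here.

```python
from collections import defaultdict
from itertools import product

# Assumed encoding: a node is either a leaf value (float) or a tuple
# (feature_index, threshold, left_subtree, right_subtree); the right
# branch is taken when x[feature_index] > threshold.

def tree_to_polynomial(tree):
    """Map a tree to {frozenset of (feature, threshold) indicators: weight}."""
    poly = defaultdict(float)

    def walk(node, positive, negative):
        if not isinstance(node, tuple):
            # Leaf contributes value * prod(I_p) * prod(1 - I_n).
            # Expand every (1 - I_n) factor by inclusion-exclusion.
            negative = list(negative)
            for mask in product([0, 1], repeat=len(negative)):
                chosen = {n for n, bit in zip(negative, mask) if bit}
                sign = -1 if len(chosen) % 2 else 1
                poly[frozenset(positive | chosen)] += sign * node
            return
        feature, threshold, left, right = node
        cond = (feature, threshold)
        walk(left, positive, negative | {cond})
        walk(right, positive | {cond}, negative)

    walk(tree, frozenset(), frozenset())
    return {mono: w for mono, w in poly.items() if w != 0.0}

def forest_to_polynomial(trees):
    """An additive ensemble is the sum of its trees' polynomials."""
    total = defaultdict(float)
    for tree in trees:
        for mono, w in tree_to_polynomial(tree).items():
            total[mono] += w
    return dict(total)

def eval_tree(tree, x):
    """Ordinary graph-style evaluation, used here only for comparison."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = right if x[feature] > threshold else left
    return tree

def eval_polynomial(poly, x):
    """Sum the weights of all monomials whose indicators are all active."""
    return sum(w for mono, w in poly.items()
               if all(x[f] > t for f, t in mono))

def prune_by_support(poly, X_train, min_support):
    """Hypothetical pruning rule: drop monomials active on too few rows."""
    kept = {}
    for mono, w in poly.items():
        active = sum(all(x[f] > t for f, t in mono) for x in X_train)
        if active / len(X_train) >= min_support:
            kept[mono] = w
    return kept

# Toy depth-2 tree with made-up leaf values.
tree = (0, 0.5, 1.0, (1, 2.0, 3.0, 7.0))
poly = tree_to_polynomial(tree)
for x in ([0, 0], [1, 0], [1, 3]):
    assert eval_tree(tree, x) == eval_polynomial(poly, x)
```

In this toy case the tree and its polynomial agree on every input, which is the property that makes the two representations interchangeable. Pruning, by contrast, changes the predictions in general, so any dropped monomial has to be checked against held-out quality.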

Language: English
Keywords: decision trees, gradient boosting

In book

Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
[s.n.], 2019.