Random forests with parametric entropy-based information gains for classification and regression problems
The random forest algorithm is one of the most popular and commonly used algorithms
for classification and regression tasks. It combines the output of multiple decision trees
to produce a single result. Random forests often achieve the highest accuracy on
tabular data among competing algorithms in various applications. However, random
forests, or more precisely their constituent decision trees, are usually built using the
classic Shannon entropy as the split criterion. In this article, we consider the potential of deformed entropies,
which have been applied successfully in the study of complex systems, to increase the
prediction accuracy of random forest algorithms. We develop and introduce information
gains based on the Rényi, Tsallis, and Sharma-Mittal entropies for classification and
regression random forests. We evaluate the proposed modifications on six benchmark
datasets: three for classification and three for regression problems. For classification
problems, the Rényi entropy improves the prediction accuracy of the random forest by
19-96% depending on the dataset, the Tsallis entropy improves it by 20-98%, and the
Sharma-Mittal entropy improves it by 22-111%, compared to the classical algorithm.
For regression problems, the deformed entropies improve the prediction by 2-23% in
terms of R² depending on the dataset.
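As an illustration of the idea (not the authors' implementation), the deformed entropies and the resulting split criterion can be sketched as follows. The function names and the parameterizations (`alpha`, `q`, `r`) are my own choices; the Sharma-Mittal form shown is one standard two-parameter convention that recovers Tsallis entropy at r = q and Rényi entropy in the limit r → 1.

```python
import numpy as np

def shannon_entropy(p):
    """Classic Shannon entropy H = -sum_i p_i * ln(p_i)."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def renyi_entropy(p, alpha):
    """Renyi entropy H_alpha = ln(sum_i p_i**alpha) / (1 - alpha), alpha != 1."""
    p = p[p > 0]
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))

def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (1 - sum_i p_i**q) / (q - 1), q != 1."""
    p = p[p > 0]
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def sharma_mittal_entropy(p, q, r):
    """Sharma-Mittal entropy (one standard convention), q != 1, r != 1.
    Reduces to Tsallis entropy at r = q and to Renyi entropy as r -> 1."""
    p = p[p > 0]
    s = np.sum(p ** q)
    return float((s ** ((1.0 - r) / (1.0 - q)) - 1.0) / (1.0 - r))

def class_probs(y):
    """Empirical class probabilities of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    return counts / counts.sum()

def information_gain(parent, left, right, entropy):
    """Entropy-based information gain of splitting `parent` into `left`/`right`:
    H(parent) minus the size-weighted average of the child entropies."""
    n = len(parent)
    return (entropy(class_probs(parent))
            - len(left) / n * entropy(class_probs(left))
            - len(right) / n * entropy(class_probs(right)))

# Example: a perfect split of a balanced two-class node.
parent = np.array([0, 0, 0, 1, 1, 1])
left, right = parent[:3], parent[3:]
g_shannon = information_gain(parent, left, right, shannon_entropy)
g_tsallis = information_gain(parent, left, right,
                             lambda p: tsallis_entropy(p, q=2.0))
```

Swapping the entropy callable changes only the split criterion, so any of the deformed entropies can be dropped into a standard decision-tree induction loop, with the entropic parameters treated as additional hyperparameters to tune.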