Using a Cascade of Supervised Machine Learning Models to Discover Causality in Pairs of Variables

Y. Zelenkov

doi:10.1007/978-3-032-22051-6_10


P. 187–205.


Discovering causality between two variables is challenging, especially in the nonlinear case. Many causation coefficients exist to identify cause and effect, but most distinguish only two types of relationships, X → Y and Y → X, treating all other cases as unidentifiable. Additionally, these models often rely on assumptions (e.g., linearity, non-Gaussian noise) that limit their practical applicability. To address these limitations, many authors adopt a supervised approach, using causation coefficients as features for machine learning models. We propose a cascade of models, each designed to identify only one type of causality. This approach allows us to focus on extracting the most informative features for each causal type. We introduce two new features: (1) the fraction of the variation explained by the first principal component, and (2) the ratio of the skewness of X and Y. These features improve the detection of confounded pairs and causal directions. Numerical experiments demonstrate that the proposed method outperforms existing supervised and unsupervised algorithms for continuous variables.
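The two new features named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption: the function names are hypothetical, the principal component is taken from the unstandardized 2x2 sample covariance matrix, and skewness is the Fisher-Pearson moment coefficient. The paper's own preprocessing and estimators may differ.

```python
import math


def first_pc_variance_fraction(x, y):
    """Share of total variance captured by the first principal component of (x, y).

    Assumption: computed from the unstandardized sample covariance matrix.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Entries of the symmetric 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    c = sum((yi - my) ** 2 for yi in y) / (n - 1)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    total = a + c  # trace = sum of both eigenvalues
    if total == 0:
        return math.nan  # both variables are constant
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam_max = total / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    return lam_max / total


def skewness(v):
    """Fisher-Pearson moment coefficient of skewness, m3 / m2**1.5."""
    n = len(v)
    m = sum(v) / n
    m2 = sum((vi - m) ** 2 for vi in v) / n
    m3 = sum((vi - m) ** 3 for vi in v) / n
    if m2 == 0:
        return math.nan  # constant sample has no defined skewness
    return m3 / m2 ** 1.5


def skewness_ratio(x, y):
    """Ratio skew(X) / skew(Y); NaN when skew(Y) is zero or undefined."""
    sy = skewness(y)
    return skewness(x) / sy if sy != 0 else math.nan
```

For a perfectly collinear pair the first function returns a value of about 1, and for an uncorrelated pair with equal variances it returns 0.5, so values near 0.5 indicate weak linear association. In a supervised setting, such values would be appended to the feature vector of each (X, Y) pair alongside the causation coefficients.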

Keywords: causal discovery, causal effect pairs, causation coefficient

In book

Parallel Computational Technologies, 19th International Conference, PCT 2025, Moscow, Russia, April 8–10, 2025, Revised Selected Papers. (CCIS, volume 2891)

Vol. 2891. Springer, 2026.