Searching for Interpretable Demographic Patterns
Large amounts of demographic data are now available and need to be analyzed and interpreted. Modern data mining methods can extract more useful information from this accumulated data. This work considers two kinds of experiments: 1) generating additional secondary features from events and evaluating their influence on classification accuracy; 2) exploring the influence of features on the classification result using SHAP (SHapley Additive exPlanations). An algorithm for creating secondary features is proposed and applied to the dataset. Classification was performed with two methods, SVM and neural networks, and the results were evaluated. The impact of events and features on the classification results was assessed using SHAP, and it is demonstrated how to tune a model to improve accuracy based on the obtained values. Applying a convolutional neural network to sequences of events improved classification accuracy and surpassed the previous best result on the studied demographic dataset.