Share of Toxic Comments among Different Topics: The Case of Russian Social Networks
With the widespread use of online social networks, it is becoming increasingly difficult to monitor and analyse all user-generated content. Toxic speech in online conversations should be treated as a serious social problem, since it may harm mental health and lead to violent actions in the physical world. In this study, we identified the share of toxic comments among different topics in Russian-language comments from the social network Pikabu. Firstly, for toxic comment classification, we manually labelled a training dataset and fine-tuned several language models. To provide further studies of toxic comments with strong classification baselines, we made our pre-trained models publicly available to the research community. Secondly, we proposed an approach to topic labelling based on six major observable dimensions of objective wellbeing measurement used by intergovernmental and government organisations. Lastly, we conducted an analysis of Pikabu data. We found that the largest share of toxic comments appeared under posts about politics, that security and socioeconomic topics ranked second and third, and that the remaining topics showed roughly equal shares.
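The central quantity of the study, the share of toxic comments per topic, can be illustrated with a minimal sketch. It assumes each comment has already been assigned a topic label and a binary toxicity label. The function name `toxic_share_by_topic`, the toy data, and the "health" topic are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def toxic_share_by_topic(comments):
    """Return {topic: share of toxic comments} for (topic, is_toxic) pairs."""
    totals = defaultdict(int)  # number of comments per topic
    toxic = defaultdict(int)   # number of toxic comments per topic
    for topic, is_toxic in comments:
        totals[topic] += 1
        if is_toxic:
            toxic[topic] += 1
    return {topic: toxic[topic] / totals[topic] for topic in totals}


# Hypothetical toy data, not results from the study.
sample = [
    ("politics", True), ("politics", True), ("politics", False),
    ("security", True), ("security", False),
    ("health", False), ("health", False),
]

shares = toxic_share_by_topic(sample)
# Topics ordered from the highest to the lowest toxic share.
ranking = sorted(shares, key=shares.get, reverse=True)
```

Ranking topics by this share is what allows a comparison such as "politics first, security second" to be stated.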