Poetry and Prose Through the Lens of Distributional Semantics
Positive mental health is considered to be a significant predictor of health and longevity; however, our understanding of the ways in which this important characteristic is represented in users’ behavior on social networking sites is limited. The goal of this study was to explore associations between positive mental health and language used in online communication in a large sample of Russian Facebook users. The five-item World Health Organization Well-Being Index (WHO-5) was used as a self-report measure of well-being. Morphological, sentiment, and semantic analyses were performed on the linguistic data. A total of 6,724 participants completed the questionnaire, and linguistic data were available for 1,972 of them. Participants’ mean age was 45.7 years (SD = 11.6 years); 73.4% were female. The dataset included 15,281 posts, with an average of 7.67 (SD = 5.69) posts per participant. The mean WHO-5 score was 60.0 (SD = 19.1), with female participants exhibiting lower scores. Use of negative sentiment words and impersonal predicates (“should statements”) demonstrated an inverse association with the WHO-5 scores. No significant correlation was found between the use of positive sentiment words and the WHO-5 scores. This study expands current understanding of the association between positive mental health and language use in online communication by employing data from a non-Western sample.
The goal of the current work is to evaluate semantic feature aggregation techniques in a task of gender classification of public social media texts in Russian. We collect Facebook posts of Russian-speaking users and use them as a dataset for two topic modelling techniques and a distributional clustering approach. The output of the algorithms is applied as a feature aggregation method in a task of gender classification based on a smaller Facebook sample. The classification performance of the best model compares favorably with the lemma baseline and with state-of-the-art results reported for a different genre or language. The resulting successful features are exemplified, and the differences between the three techniques in terms of classification performance and feature contents are discussed, with the best technique clearly outperforming the others.
The goal of this paper was to assess the connection between dark personality traits and engagement in harmful online behaviors in a sample of Russian Facebook users, and to describe the language they use in online communication. A total of 6724 individuals participated in the study (mean age = 44.96 years, age range: 18–85 years, 77.9% female). Data were collected via a purpose-built application, which served two purposes: administering the survey and downloading consenting users' public wall posts, gender and age from their Facebook profiles. The survey included questions on engagement in harmful online behaviors and the Short Dark Triad scale; 15,281 wall posts from 1972 users were included in the dataset. These posts were subjected to morphological, lexical and semantic analyses. More than 25% of the sample reported engaging in harmful online behaviors. Males were more likely to send insulting or threatening messages and post aggressive comments; no gender differences were found for disseminating other people's private information. Psychopathy and male gender were the unique predictors of engagement in harmful online behaviors. A number of significant correlations were found between the dark traits and numeric, lexical, morphological and semantic characteristics of the participants' posts.