Toxic Communication on Twitch.tv: The Effect of the Streamer
This paper investigates how spectator communication is organized in chats during broadcasts on Twitch.tv, with the main focus on toxic communication. The main purpose of the paper is to understand how the socio-demographic characteristics of a broadcaster and the channel settings that the broadcaster can control affect communication in a chat. Chat logs from Twitch.tv channels were used to create a topic model of viewers' discussions. The results of a regression analysis indicate that the socio-demographic characteristics of a broadcaster have a statistically significant effect on the type of communication that is manifested in the chat.
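To make the pipeline concrete, the following is a minimal sketch of the kind of analysis described above, not the authors' actual code: chat logs are aggregated per channel, a topic model is fitted, and the share of a toxicity-related topic is regressed on streamer attributes. The file names, column names, the choice of scikit-learn's LDA and statsmodels, and the index of the "toxic" topic are all assumptions for illustration.

import pandas as pd
import statsmodels.api as sm
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

chat = pd.read_csv("chat_logs.csv")      # hypothetical: columns channel, message
channels = pd.read_csv("channels.csv")   # hypothetical: channel, gender, age, emote_only (numeric/dummy-coded)

# One "document" per channel: concatenate that channel's chat messages.
docs = chat.groupby("channel")["message"].apply(" ".join)

# Topic model of viewer discussions.
vectorizer = CountVectorizer(min_df=5, max_df=0.9)
X = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=20, random_state=0)
doc_topics = lda.fit_transform(X)        # per-channel topic proportions

# Suppose topic 3 was interpreted as "toxic" after manual inspection.
toxic_share = pd.Series(doc_topics[:, 3], index=docs.index, name="toxic_share")

# Regress the toxic-topic share on streamer characteristics and channel settings.
data = channels.set_index("channel").join(toxic_share).dropna()
y = data["toxic_share"]
X_reg = sm.add_constant(data[["gender", "age", "emote_only"]])
print(sm.OLS(y, X_reg).fit().summary())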
Using video on the Internet has become common practice, but the television-like 'passive viewer' approach misses the benefits of the interactive nature of the Internet. The technological limitations of television can be overcome on the Internet. Having multiple sources of input does not mean they should be merged into one editor-controlled flat output. By treating streams as objects, viewers can be made editors of their own screens whenever they want, or can watch a pre-edited version. Active streams are distributed to viewers, who gain control over the scene layout. Recorded scenes can be remastered whenever needed and presented in different views simultaneously. For lectures and conference recordings, inline slide browsing is also possible. This approach was successfully tested in the Viditory.net project for the broadcasting and recording of conferences with multi-camera shots and remote speakers. Although the Adobe Flash platform has become obsolete, similar capabilities can be implemented on modern platforms using modern technologies.
Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. In this tutorial we introduce a novel non-Bayesian approach called Additive Regularization of Topic Models (ARTM). ARTM is free of redundant probabilistic assumptions and provides simple inference for many combined and multi-objective topic models.
Probabilistic topic modeling of text collections has recently been developed mainly within the framework of graphical models and Bayesian inference. In this paper we introduce an alternative semi-probabilistic approach, which we call additive regularization of topic models (ARTM). Instead of building a purely probabilistic generative model of text, we regularize an ill-posed problem of stochastic matrix factorization by maximizing a weighted sum of the log-likelihood and additional criteria. This approach enables us to combine probabilistic assumptions with linguistic and problem-specific requirements in a single multi-objective topic model. In the theoretical part of the work we derive the regularized EM algorithm and provide a pool of regularizers that can be applied together in any combination. We show that many models previously developed within the Bayesian framework can be inferred more easily within ARTM and in some cases generalized. In the experimental part we show that a combination of sparsing, smoothing, and decorrelation improves several quality measures at once with almost no loss of likelihood.
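For readers unfamiliar with the formulation, the weighted multi-criteria optimization mentioned above can be restated compactly; the formulas below follow the standard ARTM notation and are added here only for illustration, with \Phi = (\varphi_{wt}) the term-topic matrix, \Theta = (\theta_{td}) the topic-document matrix, n_{dw} the term counts, and \tau_i the regularization coefficients:

\sum_{d \in D} \sum_{w \in d} n_{dw} \ln \sum_{t \in T} \varphi_{wt}\,\theta_{td} \;+\; \sum_{i} \tau_i R_i(\Phi, \Theta) \;\longrightarrow\; \max_{\Phi,\Theta},

subject to the usual non-negativity and normalization constraints on the columns of \Phi and \Theta. The resulting regularized EM algorithm differs from the PLSA M-step only by the regularizer terms:

\varphi_{wt} \propto \Bigl( n_{wt} + \varphi_{wt}\,\frac{\partial R}{\partial \varphi_{wt}} \Bigr)_{+},
\qquad
\theta_{td} \propto \Bigl( n_{td} + \theta_{td}\,\frac{\partial R}{\partial \theta_{td}} \Bigr)_{+},

where R = \sum_i \tau_i R_i and (x)_{+} = \max(x, 0).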
An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to find the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments. We present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis.
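The following is a toy illustration of the core constraint behind interval semi-supervised LDA as described above: each seed keyword is tied to an interval of topic indices, and its topic probabilities are zeroed outside that interval during sampling. The keyword sets, interval bounds, and function names are illustrative assumptions, not the authors' implementation.

import numpy as np

# Hypothetical seed sets: keyword -> allowed half-open interval [lo, hi) of topic ids.
seed_intervals = {
    "ethnicity": (0, 3),
    "migration": (0, 3),
    "economy": (5, 8),
}

def constrain(word, topic_probs, intervals):
    """Zero out topic probabilities outside the word's allowed interval and renormalize."""
    if word not in intervals:
        return topic_probs
    lo, hi = intervals[word]
    mask = np.zeros_like(topic_probs)
    mask[lo:hi] = 1.0
    constrained = topic_probs * mask
    total = constrained.sum()
    # Fall back to a uniform distribution over the interval if all mass was removed.
    return constrained / total if total > 0 else mask / mask.sum()

probs = np.full(10, 0.1)                                # uniform over 10 topics
print(constrain("ethnicity", probs, seed_intervals))    # mass only on topics 0-2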
The aim of this article is to analyze the discursive background of the characters of teachers in the Soviet school story of the post-war period. The 1.8-million-word corpus for the study was compiled from novels about school and schooling by 37 authors, written from the 1940s to the 1980s. The content of the episodes where the keywords (headmaster, deputy headmaster, teacher, female teacher) were mentioned was analyzed automatically with the help of probabilistic topic modeling (LDA). Topics significantly more or less common in these episodes than in the whole corpus were used to characterize the discursive context of the keywords. Judging by the thematic profiles, the term 'female teacher' is opposed to all the rest. Meaningful contrasts distinguishing the thematic profiles of the terms are: the discourse of upbringing and everyday schooling, Komsomol and Pioneers, emotions and gender.
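A minimal sketch of the comparison step described above, assuming per-document topic proportions have already been estimated: it tests whether one topic is significantly more or less common in keyword episodes than in the rest of the corpus. The variable names, the synthetic data, and the choice of the Mann-Whitney U test are assumptions, not the authors' exact procedure.

import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# Hypothetical per-document proportions of a single topic.
episode_docs = rng.beta(2, 5, size=80)    # episodes mentioning a keyword
other_docs = rng.beta(2, 8, size=400)     # the rest of the corpus

stat, p_value = mannwhitneyu(episode_docs, other_docs, alternative="two-sided")
direction = "more" if episode_docs.mean() > other_docs.mean() else "less"
print(f"topic is {direction} common in keyword episodes (p = {p_value:.4f})")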
Probabilistic topic models discover a low-dimensional interpretable representation of text corpora by estimating a multinomial distribution over topics for each document and a multinomial distribution over terms for each topic. A unified family of expectation-maximization (EM)-like algorithms with smoothing, sampling, sparsing, and robustness heuristics that can be used in any combination is considered. The known models PLSA (probabilistic latent semantic analysis), LDA (latent Dirichlet allocation), and SWB (special words with background), as well as new ones, can be considered special cases of the presented broad family of models. A new simple robust algorithm, suitable for sparse models, that does not require estimating and storing a big matrix of noise parameters is proposed. The present authors experimentally find optimal combinations of heuristics with sparsing strategies and discover that a sparse robust model without Dirichlet smoothing performs very well and gives more than 99% of zeros in the multinomial distributions without loss of perplexity.
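To illustrate the family of algorithms referred to above, here is a minimal PLSA-style EM iteration with a simple sparsing heuristic (small probabilities are zeroed and the distributions renormalized). The corpus sizes, threshold, and number of iterations are arbitrary; this is a sketch of the general technique, not the authors' implementation.

import numpy as np

rng = np.random.default_rng(0)
D, W, T = 50, 200, 10                        # documents, vocabulary size, topics
n_dw = rng.poisson(0.3, size=(D, W))         # toy document-term counts

phi = rng.dirichlet(np.ones(W), size=T).T    # p(w|t), shape (W, T)
theta = rng.dirichlet(np.ones(T), size=D).T  # p(t|d), shape (T, D)

def em_step(n_dw, phi, theta, sparse_eps=1e-3):
    W, T = phi.shape
    D = theta.shape[1]
    n_wt = np.zeros((W, T))
    n_td = np.zeros((T, D))
    for d in range(D):
        # E-step: p(t|d,w) proportional to phi[w,t] * theta[t,d]
        p = phi * theta[:, d]
        p_sum = p.sum(axis=1, keepdims=True)
        p = np.divide(p, p_sum, out=np.zeros_like(p), where=p_sum > 0)
        # M-step accumulation of expected counts
        weighted = n_dw[d][:, None] * p
        n_wt += weighted
        n_td[:, d] = weighted.sum(axis=0)
    # M-step normalization
    phi = n_wt / (n_wt.sum(axis=0, keepdims=True) + 1e-12)
    theta = n_td / (n_td.sum(axis=0, keepdims=True) + 1e-12)
    # Sparsing heuristic: zero out small probabilities and renormalize
    phi[phi < sparse_eps] = 0.0
    phi = phi / (phi.sum(axis=0, keepdims=True) + 1e-12)
    theta[theta < sparse_eps] = 0.0
    theta = theta / (theta.sum(axis=0, keepdims=True) + 1e-12)
    return phi, theta

for _ in range(20):
    phi, theta = em_step(n_dw, phi, theta)

print("share of zeros in phi:", np.mean(phi == 0.0))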