• A
  • A
  • A
  • ABC
  • ABC
  • ABC
  • А
  • А
  • А
  • А
  • А
Regular version of the site

Book chapter

Latent Dirichlet Allocation: Stability and Applications to Studies of User-Generated content

P. 161-165.

Topic modeling, in particular the Latent Dirichlet Allocation (LDA) model, has recently emerged as an important tool for understanding large datasets, in particular, user-generated datasets in social studies of the Web. In this work, we investigate the instability of LDA inference, propose a new metric of similarity between topics and a criterion of vocabulary reduction. We show the limitations of the LDA approach for the purposes of qualitative analysis in social science and sketch some ways for improvement.