Article

Fractal approach for determining the optimal number of topics in the field of topic modeling

Journal of Physics: Conference Series. 2019. Vol. 1163. No. 1. P. 1-6.
Ignatenko V., Koltcov S., Staab S., Boukhers Z.

In this paper we apply multifractal formalism to the analysis of the
statistical behaviour of topic models under variation of the number of topics. Fractal analysis
of topic models shows that self-similar fractal clusters exist in large textual collections.
We provide numerical results for three topic models (PLSA, ARTM, LDA Gibbs sampling) on
two datasets, one English-language and one Russian-language. We
demonstrate that clusters form precisely in the transition regions. Linear regions
do not lead to changes in the fractals; therefore, it is sufficient to find the transition regions when
studying textual collections. Accordingly, the problem of analysing the evolution of topic
models can be reduced to the problem of searching for transition regions in topic models.