Additive Regularization for Hierarchical Multimodal Topic Modeling

N. A. Chirkova; K. V. Vorontsov

doi:10.21469/22233792.2.2.05

Journal of machine learning and data analysis. 2016. Vol. 2. No. 2. P. 187–200.

Probabilistic topic models uncover the latent semantics of text collections and represent each document by a multinomial distribution over topics. Hierarchical models divide topics into subtopics recursively, which simplifies information retrieval, browsing, and understanding of large multidisciplinary collections. Most existing approaches to hierarchy learning rely on Bayesian inference, which makes it difficult to incorporate topical hierarchies into other types of topic models. The authors use a non-Bayesian multicriteria approach called Additive Regularization of Topic Models (ARTM), which makes it possible to combine any topic models formalized via log-likelihood maximization with additive regularization criteria. In this work, such a formalization is proposed for topical hierarchies. Hence, the hierarchical ARTM (hARTM) can be easily adapted to a wide class of text mining problems, e.g., learning topical hierarchies from multimodal and multilingual heterogeneous data of scientific digital libraries or social media. The authors focus on topical hierarchies that allow a topic to have several parent topics, which is important for multidisciplinary collections of scientific papers. The regularization approach allows one to control the sparsity of the parent–child relation and to determine the number of subtopics for each topic automatically. Before learning the hierarchy, it is necessary to fix the number of topics for each layer. Additive regularization does not complicate the learning algorithm, so the approach scales well to large text collections.
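The idea of adding regularization criteria to log-likelihood maximization can be illustrated with a minimal sketch. It is not the hARTM algorithm from the paper: it fits a flat (single-layer) topic model by EM and assumes the general regularized M-step form phi_wt ∝ max(n_wt + phi_wt · ∂R/∂phi_wt, 0), with one illustrative sparsing regularizer R = -tau · Σ ln phi_wt, for which the correction term is simply -tau. The function names (`artm_em`, `normalize_nonneg`), the parameters `tau_phi` and `tau_theta`, and the toy counts are all hypothetical and chosen only for this example.

```python
import random


def normalize_nonneg(values):
    """Clip negatives to zero, then normalize to sum to one.

    If nothing positive remains, return all zeros (the row is "switched off").
    """
    clipped = [max(v, 0.0) for v in values]
    total = sum(clipped)
    if total == 0.0:
        return [0.0] * len(values)
    return [v / total for v in clipped]


def artm_em(n_dw, num_topics, tau_phi=0.0, tau_theta=0.0, iters=50, seed=0):
    """Regularized EM for a flat topic model (illustrative sketch).

    n_dw: list of D rows, each a list of W word counts.
    Returns (phi, theta) with phi[t][w] ~ p(w|t) and theta[d][t] ~ p(t|d).
    tau_phi = tau_theta = 0 reduces to plain log-likelihood maximization.
    """
    rng = random.Random(seed)
    num_docs, num_words = len(n_dw), len(n_dw[0])
    phi = [normalize_nonneg([rng.random() + 0.1 for _ in range(num_words)])
           for _ in range(num_topics)]
    theta = [normalize_nonneg([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_docs)]

    for _ in range(iters):
        n_wt = [[0.0] * num_words for _ in range(num_topics)]
        n_td = [[0.0] * num_topics for _ in range(num_docs)]

        # E-step: distribute each count n_dw over topics by p(t | d, w).
        for d in range(num_docs):
            for w in range(num_words):
                if n_dw[d][w] == 0:
                    continue
                joint = [phi[t][w] * theta[d][t] for t in range(num_topics)]
                z = sum(joint)
                if z == 0.0:
                    continue
                for t in range(num_topics):
                    share = n_dw[d][w] * joint[t] / z
                    n_wt[t][w] += share
                    n_td[d][t] += share

        # Regularized M-step: the assumed sparsing regularizer subtracts tau
        # from every expected count before clipping and normalizing.
        phi = [normalize_nonneg([n - tau_phi for n in row]) for row in n_wt]
        theta = [normalize_nonneg([n - tau_theta for n in row]) for row in n_td]

    return phi, theta


if __name__ == "__main__":
    counts = [[5, 4, 0, 0], [0, 0, 6, 3], [4, 5, 1, 0]]  # toy data
    phi, theta = artm_em(counts, num_topics=2, tau_phi=0.5)
    for t, row in enumerate(phi):
        print("topic", t, [round(p, 3) for p in row])
```

Because the regularizer enters only as a correction to the expected counts in the M-step, the loop has the same structure and cost as unregularized EM, which is the property the abstract appeals to when it says that regularization does not complicate the learning algorithm.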

Research target: Mathematics

Priority areas: IT and mathematics

Keywords: topic modeling; hierarchical models; additive regularization; probabilistic topic modeling; Additive Regularization of Topic Models
