Variational Inference for Sequential Distance Dependent Chinese Restaurant Process

S. Bartunov; D. Vetrov

P. 1404–1412.

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the widely used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable, and so far only MCMC methods have been used for inference. Because of the very different nature of the ddCRP, no prior developments in variational methods for Bayesian nonparametrics are applicable. In this paper we propose a novel variational inference method for the important sequential case of the ddCRP (seqddCRP) by revealing its connection with the Laplacian of the random graph constructed by the process. We develop an efficient algorithm for optimizing the variational lower bound and demonstrate its efficiency compared to the Gibbs sampler. We also apply our variational approximation to a CRP-equivalent seqddCRP mixture model, where it can be considered an alternative to the one based on the truncated stick-breaking representation. This allowed us to achieve a significantly better variational lower bound than the variational approximation based on truncated stick-breaking for the Dirichlet process.

Language: English

Keywords: clustering, variational inference, Bayesian nonparametric methods

In book

JMLR Workshop and Conference Proceedings

Issue 32: Proceedings of The 31st International Conference on Machine Learning. Beijing: Microtome Publishing, 2014.

A Clustering Model for Stocks that Considers Hidden Dynamics and Price Trajectory

Sizykh N., Sizykh D., Morychev G., IEEE Access 2025 Vol. 13 P. 213194–213210

One of the main tools for analyzing large volumes of financial data is the use of clustering methods and models, which allow the identification of various patterns. This study examines the problem of clustering time series that reflect the behavior of prices, yields, modes, trends, and a number of related stock indicators. The relevance and ...

Added: February 3, 2026

Flexible Stock Market Algorithm

Rubchinskiy A., Chubarova D., Technology and Investment 2025 Vol. 16 No. 4 P. 211–240

The article considers one of the most famous examples of socio-economic systems characterized by significant uncertainty: the S&P-500 stock market, where shares of the 500 largest US companies are traded. A flexible algorithm for daily trading has been developed. It is based on known fixed data about the cost of shares on previous days as well as on ...

Added: December 19, 2025

An efficient stock market trading algorithm: a retrospective analysis based on S&P-500 data.

Rubchinskiy A., Chubarova D. / Series WP7 "Mathematical methods for decision making in economics, business and politics". 2025. No. WP7/2025/01.

The article examines one of the most famous examples of socio-economic systems, characterized by significant uncertainty – the S&P-500 stock market, where shares of 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...

Added: November 9, 2025

Computer tools in mental disorders diagnostics by oral speech

Khomenko A., Komratova A., Isakov D. et al., in: Computational linguistics and intellectual technologies. Papers from the Annual International Conference "Dialogue" (2025). Vol. 23. [s.n.], 2025. P. 147–157.

The integration of automated speech analysis in diagnosing mental health disorders is becoming increasingly significant in both clinical and computational linguistics. This study aims to construct linguistic profiles for individuals with neurocognitive and affective mental disorders. Using speech transcriptions and relevant to the study computational techniques like lexical clustering and stylostatistical analysis, this research looks ...

Added: October 19, 2025

Tracking the development of fracture by clustering pulses of thermally stimulated acoustic emission in the absence of location

Индаков Г. С., Казначеев П. А., Майбук З. Я. et al., Геофизические исследования 2025 Vol. 26 No. 2 P. 99–124

The paper studies the clusterability of acoustic emission pulses during high-temperature heating of sandstone sample preliminarily subjected to mechanical loading. Mechanical loading was applied in uniaxial mode up to load close to destructive with appearance of signs of large cracks on the surface. After that, samples were subjected to thermal treatment up to 650 °C ...

Added: September 19, 2025

Analysis of the topics of everyday conversations: an expert approach and automatic methods

Sherstinova T., Вепринцева Д. А., Человек: образ и сущность. Гуманитарные аспекты 2025 No. 2(62) P. 89–108

The article considers three different approaches to studying the topics of everyday conversations: expert topic annotation and two automatic methods (topic modeling and clustering). The material for the study consists of transcripts of Russian everyday spoken language from the ORD corpus, prepared from audio recordings of spontaneous conversations made in natural communicative situations (at home, at work, at an educational institution, in a shop, in a clinic ...

Added: September 3, 2025

Maksimov A. G., Telezhkina M. / NRU Higher School of Economics. Series EC "Economics". 2024. No. 271.

The paper examines similarity of models with structural changes among heterogeneous panel data units. We propose applying a cosine metric to compare angles between vectors of weighted coefficients as a measure of closeness of economic models. Testing whether the cosine metric value is zero against nonzero, positive, and negative alternatives enriches traditional testing results. The ...

Added: March 10, 2025

Tunnel clustering method

Aleskerov F. T., Myachin A. L., Yakuba V. I., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2024 Vol. 520 No. 1 P. 29–34

A new method for fast pattern discovery in high-dimensional numerical data, called "tunnel clustering", is proposed. The main advantages of the new method are relatively low computational complexity, endogenous determination of the composition and number of clusters, and a high degree of interpretability of the final results. Three different variations are described: with fixed hyperparameters, with adaptive hyperparameters, and a combined approach. Three main properties of tunnel clustering are considered. A practical application is shown on both synthetic ...

Added: March 3, 2025

Tunnel Clustering Method

F. T. Aleskerov, A. L. Myachin, V. I. Yakuba, Doklady Mathematics 2024 Vol. 110 No. 3 P. 474–479

We propose a novel method for rapid pattern analysis of high-dimensional numerical data, termed tunnel clustering. The main advantages of the method are its relatively low computational complexity, endogenous determination of cluster composition and number, and a high degree of interpretability of final results. We present descriptions of three different variations: one with fixed hyperparameters, ...

Added: March 3, 2025

Using Z-numbers to describe a data set

Гусейнов О., Degtyarev K. Y., IRETC MTÜ PAHTEI - Proceedings of Azerbaijan High Technical Educational Institutions 2025 Vol. 48 No. 1 P. 360–370

The concept of Z-number was proposed by Prof. Lotfi Zadeh to describe partial reliability of information, and it is a kind of fusion of fuzziness and probabilistic uncertainty. Z-number can be presented as a pair of fuzzy numbers Z(A,B) used to describe a value of a random variable X. The first component (A) is a ...

Added: February 20, 2025

Gradient descent clustering with regularization to recover communities in transformed attributed networks

Shalileh S., Social Network Analysis and Mining 2025 Vol. 15212 P. 137–148

Community detection in attributed networks aims to recover clusters in which the within-community nodes are as interconnected and as homogeneous as possible, while the between-communities nodes are as disconnected and as heterogeneous as possible. The current research proposes a straightforward data-driven model with an integrated regularization term to recover communities. For further improvement of the ...

Added: November 30, 2024

An empirical scrutinization of four crisp clustering methods with four distance metrics and one straightforward interpretation rule

T. A. Alvandyan, S. Shalileh, Doklady Mathematics 2024 Vol. 110 No. S1 P. S236–S250

Clustering has always been in great demand by scientific and industrial communities. However, due to the lack of ground truth, interpreting its obtained results can be debatable. The current research provides an empirical benchmark on the efficiency of three popular and one recently proposed crisp clustering methods. To this end, we extensively analyzed these (four) ...

Added: November 30, 2024

"When leaving, leave": who stays with Russia and how are imports redistributed?

Gnidchenko A., Mikheeva O. M., Salnikov V., Вопросы экономики 2023 No. 12 P. 48–65

We examine the division of countries according to their political attitude towards Russia after the launch of a special military operation in Ukraine and the introduction of large-scale sanctions, and illustrate the importance of sanctions and political attitude to Russia for countries’ exports there with the available statistical data. The countries are clustered by their ...

Added: October 28, 2024