Topic Models Can Improve Domain Term Extraction

Elena Bolshakova; Natalia Loukachevitch; M. Nokel

P. 684–687.

Abstract. The paper describes the results of an experimental study of
topic models applied to the task of single-word term extraction. The
experiments cover several probabilistic and non-probabilistic topic
models and demonstrate that topic information improves the quality of
term extraction, and that NMF with KL-divergence minimization performs
best among the models studied.

Language: English

Keywords: clustering, topic models, single-word term extraction, single-word terms

In book

Proc. 35th European Conference on Information Retrieval (ECIR 2013): Advances in Information Retrieval

Vol. 7814. Springer, 2013.

On the problem of building a decentralized intelligent transport system based on the RAFT protocol and clustering by network distance

Городничев М. Г., Саксонов Е. А., Кулагин В. П. et al., Вестник Рязанского государственного радиотехнического университета, Russian Federation 2025 No. 94 P. 59–67

The article is devoted to the development and experimental evaluation of a decentralized architecture for an intelligent transport system (ITS) based on the Raft consensus protocol and a method for clustering servers by a network distance metric (RTT). It is shown that existing solutions either require manual configuration and centralized coordination, or are not optimized for latency with ...

Added: March 25, 2026

A Clustering Model for Stocks that Considers Hidden Dynamics and Price Trajectory

Morychev G., Sizykh D., Sizykh N., IEEE Access 2025 Vol. 13 P. 213194–213210

One of the main tools for analyzing large volumes of financial data is the use of clustering methods and models, which allow the identification of various patterns. This study examines the problem of clustering time series that reflect the behavior of prices, yields, modes, trends, and a number of related stock indicators. The relevance and ...

Added: February 3, 2026

Flexible Stock Market Algorithm

Rubchinskiy A., Chubarova D., Technology and Investment 2025 Vol. 16 No. 4 P. 211–240

The article considers one of the most famous examples of socio-economic systems characterized by significant uncertainty: the S&P-500 stock market, where shares of the 500 largest US companies are traded. A flexible algorithm for daily trading has been developed. It is based on known fixed data about the cost of shares in previous days as well as on ...

Added: December 19, 2025

An effective stock market trading algorithm: a retrospective analysis based on S&P-500 data.

Rubchinskiy A., Chubarova D. / Series WP7 "Mathematical methods for decision making in economics, business and politics". 2025. No. WP7/2025/01.

The article examines one of the most famous examples of socio-economic systems characterized by significant uncertainty: the S&P-500 stock market, where shares of the 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...

Added: November 9, 2025

Computer tools in mental disorders diagnostics by oral speech

Khomenko A., Komratova A., Isakov D. et al., in: Computational linguistics and intellectual technologies. Papers from the Annual International Conference "Dialogue" (2025). Vol. 23. [s.n.], 2025. P. 147–157.

The integration of automated speech analysis in diagnosing mental health disorders is becoming increasingly significant in both clinical and computational linguistics. This study aims to construct linguistic profiles for individuals with neurocognitive and affective mental disorders. Using speech transcriptions and computational techniques relevant to the study, such as lexical clustering and stylostatistical analysis, this research looks ...

Added: October 19, 2025

Tracking fracture development by clustering pulses of thermally stimulated acoustic emission in the absence of location

Индаков Г. С., Казначеев П. А., Майбук З. Я. et al., Геофизические исследования 2025 Vol. 26 No. 2 P. 99–124

The paper studies the clusterability of acoustic emission pulses during high-temperature heating of sandstone samples previously subjected to mechanical loading. Mechanical loading was applied in uniaxial mode up to a load close to destructive, with signs of large cracks appearing on the surface. After that, the samples were subjected to thermal treatment up to 650 °C ...

Added: September 19, 2025

Analysis of the topics of everyday conversations: an expert approach and automatic methods

Sherstinova T., Вепринцева Д. А., Человек: образ и сущность. Гуманитарные аспекты 2025 No. 2(62) P. 89–108

The article considers three different approaches to studying the topics of everyday conversations: expert topic annotation and two automatic methods (topic modeling and clustering). The research material consists of transcripts of Russian everyday spoken speech from the ORD corpus, prepared from audio recordings of spontaneous conversations made in natural communicative situations (at home, at work, at an educational institution, in a shop, at a clinic ...

Added: September 3, 2025

Maksimov A. G., Telezhkina M. / NRU Higher School of Economics. Series EC "Economics". 2024. No. 271.

The paper examines similarity of models with structural changes among heterogeneous panel data units. We propose applying a cosine metric to compare angles between vectors of weighted coefficients as a measure of closeness of economic models. Testing whether the cosine metric value is zero against nonzero, positive, and negative alternatives enriches traditional testing results. The ...

Added: March 10, 2025

Tunnel Clustering Method

Aleskerov F. T., Myachin A. L., Yakuba V. I., Доклады Российской академии наук. Математика, информатика, процессы управления (formerly Доклады Академии Наук. Математика) 2024 Vol. 520 No. 1 P. 29–34

A new method for rapid pattern discovery in high-dimensional numerical data, called "tunnel clustering", is proposed. The main advantages of the new method are relatively low computational complexity, endogenous determination of the composition and number of clusters, and a high degree of interpretability of the final results. Three different variations are described: with fixed hyperparameters, with adaptive hyperparameters, and a combined approach. Three main properties of tunnel clustering are considered. Practical application is demonstrated on synthetic ...

Added: March 3, 2025

Tunnel Clustering Method

F. T. Aleskerov, A. L. Myachin, V. I. Yakuba, Doklady Mathematics 2024 Vol. 110 No. 3 P. 474–479

We propose a novel method for rapid pattern analysis of high-dimensional numerical data, termed tunnel clustering. The main advantages of the method are its relatively low computational complexity, endogenous determination of cluster composition and number, and a high degree of interpretability of final results. We present descriptions of three different variations: one with fixed hyperparameters, ...

Added: March 3, 2025

Using Z-numbers to describe a data set

Гусейнов О., Degtyarev K. Y., IRETC MTÜ PAHTEI - Proceedings of Azerbaijan High Technical Educational Institutions 2025 Vol. 48 No. 1 P. 360–370

The concept of Z-number was proposed by Prof. Lotfi Zadeh to describe partial reliability of information, and it is a kind of fusion of fuzziness and probabilistic uncertainty. Z-number can be presented as a pair of fuzzy numbers Z(A,B) used to describe a value of a random variable X. The first component (A) is a ...

Added: February 20, 2025

Gradient descent clustering with regularization to recover communities in transformed attributed networks

Shalileh S., Social Network Analysis and Mining 2025 Vol. 15212 P. 137–148

Community detection in attributed networks aims to recover clusters in which the within-community nodes are as interconnected and as homogeneous as possible, while the between-communities nodes are as disconnected and as heterogeneous as possible. The current research proposes a straightforward data-driven model with an integrated regularization term to recover communities. For further improvement of the ...

Added: November 30, 2024

An empirical scrutinization of four crisp clustering methods with four distance metrics and one straightforward interpretation rule

T. A. Alvandyan, S. Shalileh, Doklady Mathematics 2024 Vol. 110 No. S1 P. S236–S250

Clustering has always been in great demand by scientific and industrial communities. However, due to the lack of ground truth, interpreting its obtained results can be debatable. The current research provides an empirical benchmark on the efficiency of three popular and one recently proposed crisp clustering methods. To this end, we extensively analyzed these (four) ...

Added: November 30, 2024