Topic models can improve domain term extraction

E. I. Bolshakova; N. V. Loukachevitch; Nokel M.

P. 684–687.

The paper describes the results of an experimental study of topic models applied to the task of single-word term extraction. The experiments encompass several probabilistic and non-probabilistic topic models and demonstrate that topic information improves the quality of term extraction and that NMF with KL-divergence minimization performs best among the models under study.

Language: English

Keywords: clustering, topic models, single-word term extraction

In book

Proc. 35th European Conference on Information Retrieval (ECIR 2013): Advances in Information Retrieval

Vol. 7814. Springer, 2013.

Topic models in the task of single-word term extraction

Nokel M. A., Loukachevitch N. V., Программная инженерия 2014 No. 3 P. 34–40

The paper describes the results of an experimental study of statistical topic models applied to the task of automatic single-word term extraction. The English part of the Europarl parallel corpus from the socio-political domain and the Russian articles taken from online banking magazines were used as target text collections. The experiments demonstrate that topic information ...

Added: October 1, 2014

Topic Models Can Improve Domain Term Extraction

Elena Bolshakova, Natalia Loukachevitch, Nokel M., in: Proc. 35th European Conference on Information Retrieval (ECIR 2013): Advances in Information Retrieval. Vol. 7814. Springer, 2013. P. 684–687.

Abstract. The paper describes the results of an experimental study of topic models applied to the task of single-word term extraction. The experiments encompass several probabilistic and non-probabilistic topic models and demonstrate that topic information improves the quality of term extraction and that NMF with KL-divergence minimization performs best among the models under study. ...

Added: October 1, 2014

Using topic models in single-word term extraction

Nokel M. A., Loukachevitch N. V., in: Selected Papers of the 15th All-Russian Scientific Conference "Digital Libraries: Advanced Methods and Technologies, Digital Collections", Yaroslavl, Russia, October 14-17, 2013. Vol. 1108. CEUR Workshop Proceedings, 2013. P. 52–60.

The paper presents the results of experiments applying topic models to the task of single-word term extraction. A selection of Russian-language articles from online banking magazines and the English part of the Europarl parallel corpus were used as text collections. The experiments show that using topic information significantly improves the quality of single-word term extraction regardless of the subject domain and the language used. ...

Added: October 1, 2014

Combining Lexical Substitutes in Neural Word Sense Induction

Nikolay Arefyev, Boris S., Panchenko A., in: Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2019. INCOMA Ltd, 2019. P. 62–70.

Word Sense Induction (WSI) is the task of grouping occurrences of an ambiguous word according to their meaning. In this work, we improve the approach to WSI proposed by Amrami and Goldberg (2018) based on clustering of lexical substitutes for an ambiguous word in a particular context obtained from neural language models. Namely, we ...

Added: October 9, 2020

Clustering cities based on their development dynamics and Variable neighborhood search

B. Zhikharevich, Electronic Notes in Discrete Mathematics 2015 No. 47 P. 213–220

Clustering cities based on their socio-economic development over a long time period is an important issue and may be used in many ways, e.g., in strategic regional planning. In this paper we continue our recent study where a cumulative attribute for each year replaces nine other attributes, called ’vector of dynamics’. In our previous paper some ...

Added: October 11, 2015

CEE-SECR '19 Proceedings of the 15th Central and Eastern European Software Engineering Conference in Russia

Silakov D., NY: ACM, 2019.

Added: November 20, 2019

Clustering and Generalized ANOVA for Symbolic Data Constructed from Open Data

Korenjak–Cerne S., Kejzar N., Batagelj V., in: Advances in Data Sciences: Symbolic, Complex and Network Data. ISTE, Wiley, 2020. P. 209–228.

...

Added: December 10, 2019

Processing and analysis of monitoring results for managing the formation of conditions for quality education

Shvindt A., Моделирование, оптимизация и информационные технологии 2017 Vol. 5 No. 4 P. 1–18

The article reviews models and procedures for processing and evaluating monitoring results, including student participation, focused on intellectual support of administrative managerial decisions when developing conditions and corresponding resources for achieving the applicable regulatory requirements for the quality of university education. The first stage of processing is normalization of factors which characterize ...

Added: August 19, 2019

Clustering of Biomedical Data Using the Greedy Clustering Algorithm Based on Interval Pattern Concepts

Galatenko A. V., Nersisyan S., Pankratieva V., in: Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI/ECAI 2019). [s.n.], 2019. P. 65–74.

Interval pattern concepts are a particular case of pattern structures. They can be used to clusterize rows of a numerical formal context (data matrix): two rows are close to each other if their entries at the corresponding positions fall within a given interval. The problem of mining interval pattern concepts has much in common with the known problem related to ...

Added: April 28, 2020

Cluster analysis of cardiological data

Zimina E. Yu., Статистика и Экономика 2018 Vol. 15 No. 2 P. 30–37

The article surveys cluster analysis of medical data, using cardiac data as an example. Clustering methods are among the most effective and commonly used data mining methods applied to large amounts of information (for example, in mathematical economics): the search for signs of similarity between objects in the study of the subject area ...

Added: May 29, 2018

Classification of normal and pathological brain networks based on similarity in graph partitions

Kurmukov A., Dodonova Y., Zhukov L. E., in: 16th IEEE International Conference on Data Mining Workshops (ICDMW). NY: IEEE Computer Society, 2016. P. 107–112.

We consider a task of classifying normal and pathological brain networks. These networks (called connectomes) represent macroscale connections between predefined brain regions; hence, the nodes of connectomes are uniquely labeled and the set of labels (brain regions) is the same across different brains. We make use of this property and hypothesize that connectomes obtained from ...

Added: December 9, 2016

The scientific-educational cluster as a mechanism for establishing an innovative type of partnership in the practice of Russia-China co-development: potential, priorities, and development vector in the cross-border sociocultural space

Dubrovskaya K. S., Morozova V., Journal of Siberian Federal University. Series: Humanities & Social Sciences 2016 Vol. 9 No. 11 P. 2575–2580

The article substantiates the necessity for creation and development of a scientific-educational cluster under the conditions of activating Russian-Chinese co-development processes in the cross-border sociocultural medium. In the context of persistent expansion of Chinese “soft power”, clustering is more than a way of concentrating material and intellectual resources. Clustering means the only chance for ...

Added: July 20, 2018

An empirical comparison of connectivity-based distances on a graph and their computational scalability

Miasnikof P., Shestopaloff A., Pitsoulis L. et al., Journal of Complex Networks 2022 Vol. 10 No. 1 Article cnac003

In this study, we compare distance measures with respect to their ability to capture vertex community structure and the scalability of their computation. Our goal is to find a distance measure which can be used in an aggregate pairwise minimization clustering scheme. The minimization should lead to subsets of vertices with high induced subgraph density. ...

Added: November 21, 2022

APPLICATION OF DATA ENVELOPMENT ANALYSIS IN MANAGEMENT RESEARCH (CASE OF RUSSIAN DOMESTIC ENERGY SECTOR)

Volkova I., / NRU Higher School of Economics. Series MAN "Management". 2013.

The idea that different firms can be classified into relatively homogeneous groups has been popular for many years, and many typologies have been developed and tested using a variety of classification tools. It has become apparent, however, that most clustering tools are somewhat limited, because they create groups of companies based on similar characteristics, without ...

Added: February 18, 2014

Clustering of medical big data as a toolkit for decision support systems in mathematical cardiology using cloud technologies

Shmid A., Novopashin M. A., Zimina E. Yu., Системный администратор 2018 Vol. 188-189 No. 07-08 P. 92–96

The widespread use of mobile devices for recording electrocardiograms (ECGs) is increasing the number of ECGs from many patients that are available for study. This opens new opportunities for studying oscillatory processes in the long-term dynamics of the individual state of any patient's cardiovascular system (CVS). The article demonstrates new possibilities for long-term continuous monitoring of the CVS state of large numbers of patients, making it possible to identify patterns in CVS dynamics that lead to ...

Added: September 13, 2018

Russian Nationalist Movement Restructuring in light of the Ukrainian Events which took place in 2013-14

Rotmistrov A., / Series SSRN Working Paper Series "SSRN Working Paper Series". 2015.

The events in Ukraine in 2013-2014 attracted the Russian society’s attention and affected the Russian political agenda. One of the most affected sectors of the Russian domestic policy was Russian nationalist organizations. The issue of radical nationalism has become essential for European countries and for Russia in particular. But this object is rather difficult to ...

Added: October 15, 2015

Clustering of agents in the bounded neighbourhood model

Akopov A. S., Beklaryan A., Искусственные общества 2020 Vol. 15 No. 3 P. 1–11

This article presents a new approach to designing agent-based bounded neighbourhood models (Schelling models). An original agent-based model in the AnyLogic system has been developed, which describes the segregation processes caused by the behaviour patterns of agent-individuals. Various scenarios (environment characteristics) affecting the cluster structure of the spatial distribution of agents are examined. Using the proposed bounded ...

Added: September 14, 2020

The Minkowski central partition as a pointer to a suitable distance exponent and consensus partitioning

Mirkin B., Amorim R., Makarenkov V. et al., Pattern Recognition 2017 Vol. 67 P. 62–72

The Minkowski weighted K-means (MWK-means) is a recently developed clustering algorithm capable of computing feature weights. The cluster-specific weights in MWK-means follow the intuitive idea that a feature with low variance should have a greater weight than a feature with high variance. The final clustering found by this algorithm depends on the selection of the ...

Added: March 30, 2017

Analysis and interpretation of imaging mass spectrometry data by clustering mass-to-charge images according to their spatial similarity

Alexandrov T., Chernyavsky I., Becker M. et al., Analytical Chemistry 2013 Vol. 85 No. 23 P. 11189–11195

Imaging mass spectrometry (imaging MS) has emerged in the past decade as a label-free, spatially resolved, and multipurpose bioanalytical technique for direct analysis of biological samples from animal tissue, plant tissue, biofilms, and polymer films. Imaging MS has been successfully incorporated into many biomedical pipelines where it is usually applied in the so-called untargeted mode, capturing spatial localization of a multitude of ions ...

Added: November 18, 2013

Breeds of cooccurrence: an attempt at classification

Roytberg M.A., Roytberg A.M., Khachko D. V., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Bekasovo, May 29 - June 2, 2013). In 2 vols. Vol. 1: Main conference program. Issue 12 (19). Moscow: RSUH, 2013. P. 568–578.

The paper proposes a substantial classification of collocates (pairs of words that tend to cooccur) along with heuristics that can help to attribute a word pair to a proper type automatically. The best studied type is frequent phrases, which includes idioms, lexicographic collocations, and syntactic selection. Pairs of this type are known to occur at a ...

Added: May 6, 2014

Usage of Clustering of Paley Graphs in Polar Coordinates for the Development of New Network on Chip Topologies

Alijon F. Fatullaev, Edward R. Rzaev, Aleksandr Yu. Romanov, in: 2022 International Russian Automation Conference (RusAutoCon). IEEE, 2022. P. 419–423.

The article presents a study of clustering of Paley graphs with the arrangement of prime numbers in polar coordinates and a comparison of the resulting groups in terms of their static parameters; the application of a fault-tolerant self-organizing routing method for new topologies is also considered. This article is a continuation of a series of articles ...

Added: October 2, 2022

Clustering cities based on their development dynamics and Variable neighborhood search

B. S. Zhikharevich, Electronic Notes in Discrete Mathematics 2015 No. 47 P. 213–220

Clustering cities based on their socio-economic development over a long time period is an important issue and may be used in many ways, e.g., in strategic regional planning. In this paper we continue our recent study where a cumulative attribute for each year replaces nine other attributes, called ’vector of dynamics’. In our previous paper an original ranking method was proposed. ...

Added: November 12, 2015

Formal Concept Analysis: 16th International Conference, ICFCA 2021, Strasbourg, France, June 29 – July 2, 2021, Proceedings

Springer, 2021.

This book constitutes the proceedings of the 16th International Conference on Formal Concept Analysis, ICFCA 2021, held in Strasbourg, France, in June/July 2021. The 14 full papers and 5 short papers presented in this volume were carefully reviewed and selected from 32 submissions. The book also contains four invited contributions in full paper length. The research part ...

Added: July 10, 2021