Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads

B. Mirkin

doi:10.1007/s00357-010-9049-5

Journal of Classification. 2010. Vol. 27. No. 1. P. 3–40.

Chiang M., Mirkin B.

The issue of determining “the right number of clusters” in K-Means has attracted considerable interest, especially in recent years. Cluster intermix appears to be the factor that most affects the clustering results. This paper proposes an experimental setting for comparing different approaches on data generated from Gaussian clusters, with controlled between- and within-cluster spread parameters used to model cluster intermix. The setting allows centroid recovery to be evaluated on a par with the conventional evaluation of cluster recovery. The subjects of our interest are two versions of the “intelligent” K-Means method, iK-Means, which find the “right” number of clusters by extracting “anomalous patterns” from the data one by one. We compare them with seven other methods, including Hartigan’s rule, averaged Silhouette width and the Gap statistic, under different between- and within-cluster spread-shape conditions. The results of our experiments show several consistent patterns; for example, the right K is reproduced best by Hartigan’s rule, although the clusters and their centroids are not. This leads us to propose an adjusted version of iK-Means, which performs well in the current experimental setting.
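As an illustration of what extracting “anomalous patterns” one by one can look like, here is a minimal sketch. It rests on assumptions rather than on the paper's exact procedure: the reference point is taken to be the grand mean of the data and is kept fixed, each pattern is grown around the point farthest from that reference, and patterns smaller than a hypothetical `min_size` threshold are discarded. The function names (`anomalous_pattern`, `ik_means_seeds`) are invented for this example, and the final K-Means run that would start from the returned seeds is omitted.

```python
from math import dist


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def anomalous_pattern(points, reference, max_iter=100):
    """Grow one cluster around the point farthest from `reference`.

    Returns (member_indices, centroid). A point joins the pattern when it
    is strictly closer to the current centroid than to the reference.
    """
    far = max(range(len(points)), key=lambda i: dist(points[i], reference))
    centroid = points[far]
    members = None
    for _ in range(max_iter):
        new_members = [i for i, p in enumerate(points)
                       if dist(p, centroid) < dist(p, reference)]
        if not new_members:
            # Degenerate case: every point coincides with the reference.
            new_members = list(range(len(points)))
        if new_members == members:
            break  # membership is stable, so the centroid is stable too
        members = new_members
        centroid = mean([points[i] for i in members])
    return members, centroid


def ik_means_seeds(points, min_size=2):
    """Extract patterns one by one; keep centroids of large enough ones.

    The number of returned seeds is the suggested K (an assumption of
    this sketch: patterns with fewer than `min_size` points are dropped).
    """
    reference = mean(points)  # fixed for all extractions
    remaining = list(points)
    seeds = []
    while remaining:
        members, centroid = anomalous_pattern(remaining, reference)
        if len(members) >= min_size:
            seeds.append(centroid)
        taken = set(members)
        remaining = [p for i, p in enumerate(remaining) if i not in taken]
    return seeds


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    seeds = ik_means_seeds(data)
    print(len(seeds), sorted(seeds))
```

On the two well-separated groups in the usage example, the sketch yields one seed per group, so the suggested K is 2. With more intermixed clusters the outcome depends strongly on the discard threshold, which is the kind of sensitivity the experimental setting described above is meant to probe.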

Research target: Economics and Management

Priority areas: economics, business informatics

Language: English

Keywords: clustering, K-Means clustering, number of clusters, anomalous pattern, gap statistic, economic-mathematical methods and models
