Summary and semi-average similarity criteria for individual clusters

B. Mirkin

P. 101–126.

There exists much prejudice against the within-cluster summary similarity criterion which supposedly leads to collecting all the entities in one cluster. This is not so if the similarity matrix is pre-processed by subtraction of ``noise'', of which two ways, the uniform and modularity, are mentioned in the paper. Another criterion under consideration is the semi-average within-cluster similarity, which manifests more versatile properties. In fact, both types of criteria emerge in relation to the least-squares data approximation approach to clustering, as shown in the paper. A very simple local optimization algorithm, Add-and-Remove($S$), leads to a suboptimal cluster satisfying some tightness conditions. Three versions of an iterative extraction approach are considered, leading to a portrayal of the cluster structure of the data. Of these, probably most promising is what is referred to as the incjunctive clustering approach. Applications are considered to the analysis of semantics, to integrating different knowledge aspects and consensus clustering.
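The interplay of the two criteria and the "noise" subtraction can be illustrated with a minimal sketch. It is not the paper's code: the function names, the toy matrix `A`, the exclusion of diagonal entries, and the greedy one-entity-at-a-time search (in the spirit of Add-and-Remove($S$)) are all assumptions made for illustration. The summary criterion is taken as the sum of within-cluster similarities, and the semi-average criterion as that sum divided by the cluster size.

```python
def subtract_uniform(a, shift):
    """Uniform 'noise' subtraction: take a constant off every similarity."""
    return [[x - shift for x in row] for row in a]


def subtract_modularity(a):
    """Modularity 'noise' subtraction: take off row_i * col_j / total."""
    rows = [sum(r) for r in a]
    cols = [sum(c) for c in zip(*a)]
    total = sum(rows)
    n = len(a)
    return [[a[i][j] - rows[i] * cols[j] / total for j in range(n)]
            for i in range(n)]


def summary(a, s):
    """Sum of similarities within s (diagonal excluded: an assumption)."""
    return sum(a[i][j] for i in s for j in s if i != j)


def semi_average(a, s):
    """Within-cluster summary similarity divided by the cluster size."""
    return summary(a, s) / len(s) if s else 0.0


def add_and_remove(a, seed, criterion):
    """Greedy local search: toggle the single entity whose addition or
    removal raises the criterion most; stop when no toggle improves it."""
    s = set(seed)
    best = criterion(a, s)
    while True:
        candidates = []
        for i in range(len(a)):
            t = s ^ {i}          # add i if absent, remove it if present
            if t:                # never consider the empty cluster
                candidates.append((criterion(a, t), i))
        value, i = max(candidates)
        if value <= best:
            return s, best
        s ^= {i}
        best = value


# Hypothetical toy data: two tight groups, {0, 1, 2} and {3, 4, 5},
# linked by weak positive similarities.
A = [
    [0, 5, 5, 1, 1, 1],
    [5, 0, 5, 1, 1, 1],
    [5, 5, 0, 1, 1, 1],
    [1, 1, 1, 0, 4, 4],
    [1, 1, 1, 4, 0, 4],
    [1, 1, 1, 4, 4, 0],
]

if __name__ == "__main__":
    print(add_and_remove(A, {0}, summary))                       # raw data
    print(add_and_remove(subtract_uniform(A, 2), {0}, summary))  # shifted
    print(add_and_remove(subtract_modularity(A), {0}, summary))
    print(add_and_remove(A, {0}, semi_average))
```

On this toy matrix the summary criterion applied to the raw similarities absorbs all six entities, whereas after either kind of subtraction, or under the semi-average criterion, the search started from entity 0 stops at the group {0, 1, 2}.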

Language: English

Keywords: consensus clustering; measure of similarity; single cluster; kernel cluster

Publication based on the results of:

An investigation of new methods of mathematical modelling and mechanism design in the social, economic and political sciences (2013)

In book

Models, Algorithms, and Technologies for Network Analysis

Vol. 59. NY: Springer, 2013.

Individual approximate clusters: methods, properties, applications

Mirkin B., in: Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. Issue 8170: Lecture Notes in Artificial Intelligence. Heidelberg: Springer, 2013. P. 26–37.

A least-squares data approximation approach to finding individual clusters is advocated. A simple local optimization algorithm leads to suboptimal clusters satisfying some natural tightness criteria. Three versions of an iterative extraction approach are considered, leading to a portrayal of the cluster structure of the data. Of these, probably most promising is what is referred to ...

Added: October 29, 2013

Least squares consensus clustering: criteria, methods, experiments

Mirkin B., Shestakoff A., in: Advances in Information Retrieval. L.: Springer, 2013. P. 764–768.

We develop a consensus clustering framework proposed three decades ago in Russia and experimentally demonstrate that our least squares consensus clustering algorithm consistently outperforms several recent consensus clustering methods. ...

Added: April 15, 2013

Comparative Analysis of Two Similarity Measures for the Market Graph Construction

Bautin G. A., Kalyagin V. A., Koldanov A. P., Springer Proceedings in Mathematics & Statistics. 2013. Vol. 59. P. 29–41.

A market graph is built on the basis of some similarity measure for financial asset returns. The paper considers two similarity measures: the classic Pearson correlation and the sign correlation. We study the associated market graphs and compare the conditional risk of the market graph construction for these two measures of similarity. Our main finding is that the ...

Added: September 27, 2013

A Note on the Effectiveness of the Least Squares Consensus Clustering

Mirkin B., Shestakoff A., in: Clusters, orders, trees: methods and applications. In Honor of Boris Mirkin's 70th Birthday. Vol. 92. Berlin: Springer, 2014.

We develop a consensus clustering framework proposed three decades ago in Russia and experimentally demonstrate that our least squares consensus clustering algorithm consistently outperforms several recent consensus clustering methods. ...

Added: January 23, 2015

Least-squares consensus clustering versus: (a) other consensus approaches and (b) k-means

Mirkin B., Andrey Shestakov, in: Clusters, orders, trees: methods and applications. In Honor of Boris Mirkin's 70th Birthday. Vol. 92. Berlin: Springer, 2014.

Added: November 4, 2013

DATA ANALYTICS 2014, The Third International Conference on Data Analytics

[s.n.], 2014.

Full texts of the Third International Conference on Data Analytics are presented. ...

Added: October 13, 2014

A Lattice-based Consensus Clustering Algorithm

Bocharov A. A., Gnatyshak D. V., Ignatov D. I. et al., in: CLA 2016: Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings. Vol. 1624. M.: Higher School of Economics, National Research University, 2016. P. 45–56.

We propose a new algorithm for consensus clustering, FCA-Consensus, based on Formal Concept Analysis. As input, the algorithm takes T partitions of a certain set of objects obtained by the k-means algorithm after T runs from different initialisations. The resulting consensus partition is extracted from an antichain of the concept lattice built on a formal ...

Added: October 24, 2016