Braverman’s Spectrum and Matrix Diagonalization Versus iK-Means: A Unified Framework for Clustering
In this paper, I discuss current developments in cluster analysis in order to bring forth earlier developments by E. Braverman and his team. Specifically, I begin by recalling their Spectrum clustering method and Matrix diagonalization criterion. Both involve a number of user-specified parameters, such as the number of clusters and a similarity threshold, which corresponded to the state of affairs at the early stages of data science developments; this remains the case today. Meanwhile, a data-recovery view of the Principal Component Analysis method admits a natural extension to clustering that embraces two of the most popular clustering methods, K-Means partitioning and Ward agglomerative clustering. To see this, one need only adjust the point of view and recognise an equivalent complementary criterion requiring a cluster to be simultaneously “large-sized” and “anomalous”. Moreover, this paradigm shows that the complementary criterion can be reformulated in terms of object-to-object similarities. The reformulated criterion turns out to be equivalent to the heuristic Matrix diagonalization criterion by Dorofeyuk and Braverman. Furthermore, a greedy one-by-one cluster extraction algorithm for this criterion turns out to be a version of Braverman’s Spectrum algorithm, but with automated adjustment of parameters. An illustrative example with mixed-scale data completes the presentation.