Clustering: A Data Recovery Approach
One of the goals of the first edition of this book back in 2005 was to present a coherent theory for K-Means partitioning and Ward hierarchical clustering. This theory leads to effective data pre-processing options, clustering algorithms and interpretation aids, as well as to firm relations to other areas of data analysis. The goal of this second edition is to consolidate, strengthen and extend this island of understanding in the light of recent developments. Moreover, the material on validation and interpretation of clusters is updated with a system better reflecting the current state of the art and with our recent ``lifting in taxonomies'' approach. The structure of the book has been streamlined by adding two chapters, ``Similarity Clustering'' and ``Validation and Interpretation'', while removing two chapters, ``Different Clustering Approaches'' and ``General Issues.'' The chapter on the mathematics of the data recovery approach, much extended and almost doubled in size, now concludes the book. Parts of the removed chapters are integrated within the new structure. The change has added a hundred pages and a couple of dozen examples to the text and, in fact, transformed it into a different species of book. In the first edition, the book had a Russian doll structure, with a core and a couple of nested shells around it. Now it is a linear presentation of data recovery clustering.