An Extension of K-Means for Least-Squares Community Detection in Feature-Rich Networks
We propose an extension of the celebrated K-means algorithm for community detection in feature-rich networks. Our least-squares criterion leads to a straightforward extension of the conventional batch K-means clustering method, which serves as an alternating minimization strategy for that criterion. By replacing the squared Euclidean distance inherent to K-means with the cosine distance, we effectively tackle the so-called curse of dimensionality. We compare the proposed methods with state-of-the-art algorithms from the literature on both synthetic and real-world data. The cosine distance-based version appears to be the overall winner, especially on larger datasets.
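To make the alternating scheme concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes that each node has a feature row in a matrix `Y` and a link row in a matrix `P`, and that the distance from a node to a cluster is the unweighted sum of its feature-space distance and its link-space distance to the cluster's two centers. That combination, the function names, and the `init` parameter are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def cosine_distance(X, C):
    """Pairwise cosine distance (1 - cosine similarity) between rows of X and C."""
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    Cn = C / np.maximum(np.linalg.norm(C, axis=1, keepdims=True), 1e-12)
    return 1.0 - Xn @ Cn.T


def sq_euclidean_distance(X, C):
    """Pairwise squared Euclidean distance between rows of X and C."""
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)


def kmeans_feature_rich(Y, P, k, distance=cosine_distance,
                        init=None, n_iter=100, seed=0):
    """K-means-style alternating minimization over features Y and links P.

    ASSUMPTION: node-to-cluster distance = feature distance + link distance.
    `init` (hypothetical parameter) optionally fixes the seed node indices.
    """
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    n = Y.shape[0]
    if init is None:
        rng = np.random.default_rng(seed)
        init = rng.choice(n, size=k, replace=False)
    init = np.asarray(init)

    # Each cluster has one center in feature space and one in link space.
    feat_centers = Y[init].copy()
    link_centers = P[init].copy()
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Assignment step: nearest cluster under the combined distance.
        d = distance(Y, feat_centers) + distance(P, link_centers)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the procedure has converged
        labels = new_labels

        # Update step: centers become within-cluster means.
        for j in range(k):
            members = labels == j
            if members.any():  # keep the old center if a cluster empties
                feat_centers[j] = Y[members].mean(axis=0)
                link_centers[j] = P[members].mean(axis=0)
    return labels
```

Passing `distance=sq_euclidean_distance` gives the squared-Euclidean variant, while the default corresponds to the cosine variant discussed above. Note that within-cluster means are the exact minimizers only for the squared Euclidean distance; using them with the cosine distance is a simplification made in this sketch.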