Using the gradient descent approach and automatic differentiation to enhance clustering results
Effective clustering has long been in great demand across various disciplines, and enhancing its effectiveness is always desirable. In this work, inspired by the success of gradient descent (GD) and automatic differentiation (AD) in the neural-network domain, we utilize them to enhance the effectiveness of clustering methods. GD update rules can be classified into non-adaptive and adaptive classes. The former updates all optimization parameters equally: the vanilla update rule is prone to slow convergence or to overshooting close to the bottom of the valley, and momentum-based update rules were proposed to tackle this issue. These momentum-based instances, however, effectively give more weight to parameters with constantly high gradient values, and adaptive update rules were proposed to address this concern. Keeping this in mind, we used the vanilla update rule, one momentum-based rule, and one adaptive update rule to optimize our proposed generic objective function for crisp clustering; as a result, we tested three versions of our gradient descent clustering (GDC) method. In addition, as an auxiliary device for further improvement, we implemented our proposed methods using an AD library, equipping users to exploit any differentiable distance function. We empirically scrutinized, validated, and compared the performance of our proposed methods with four popular and effective clustering methods from the literature on 21 real-world and 720 synthetic data sets. Our experiments showed that our proposed methods are valid and usually more effective than the competitors.
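To make the overall idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of gradient-descent crisp clustering built on an AD library: centroids are the optimization parameters, points are assigned crisply to their nearest centroid, and the summed point-to-centroid distance is minimized with a vanilla, a momentum-based, or an adaptive update rule. The function names (gdc, sq_euclidean), the choice of PyTorch, and all learning rates are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of gradient-descent clustering (GDC) using an AD library
# (PyTorch); the paper's actual objective and hyperparameters may differ.
import torch

def gdc(X, k, distance, optimizer_cls, epochs=200, seed=0, **opt_kwargs):
    """Crisp clustering: minimize the total distance of each point
    to the centroid of the cluster it is currently assigned to."""
    torch.manual_seed(seed)
    n, _ = X.shape
    # Initialize centroids at randomly chosen data points; they are the
    # parameters updated by gradient descent.
    centroids = X[torch.randperm(n)[:k]].clone().requires_grad_(True)
    opt = optimizer_cls([centroids], **opt_kwargs)
    for _ in range(epochs):
        # Crisp assignment: each point belongs to its nearest centroid.
        D = distance(X[:, None, :], centroids[None, :, :])  # shape (n, k)
        labels = D.argmin(dim=1)
        loss = D[torch.arange(n), labels].sum()
        opt.zero_grad()
        loss.backward()  # AD provides gradients for any differentiable distance
        opt.step()
    return labels.detach(), centroids.detach()

# Any differentiable distance function can be plugged in, e.g. squared Euclidean.
sq_euclidean = lambda a, b: ((a - b) ** 2).sum(dim=-1)

X = torch.randn(300, 2)
# Three update rules: vanilla GD, a momentum-based rule, and an adaptive rule (Adam).
for name, cls, kw in [("vanilla", torch.optim.SGD, dict(lr=0.01)),
                      ("momentum", torch.optim.SGD, dict(lr=0.01, momentum=0.9)),
                      ("adaptive", torch.optim.Adam, dict(lr=0.05))]:
    labels, centroids = gdc(X, k=3, distance=sq_euclidean,
                            optimizer_cls=cls, **kw)
```

Swapping sq_euclidean for another differentiable distance (e.g., Manhattan) requires no other change, which is the practical benefit of implementing the method on top of an AD library.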