Notice of Violation of IEEE Publication Principles
"K-means versus K-means ++ Clustering Technique"
by Shalove Agarwal, Shashank Yadav, and Kanchan Singh
in the Proceedings of the 2012 Students Conference on Engineering and Systems (SCES) March 2012
After careful and considered review of the content and authorship of this paper by a duly constituted expert committee, this paper has been found to be in violation of IEEE's Publication Principles.
This paper contains copied material from the original papers cited below. The original text was copied without attribution (including appropriate references to the original author(s) and/or paper title) and without permission.
"K-means++: The Advantages of Careful Seeding"
by David Arthur and Sergei Vassilvitskii
in the Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) 2007, pp. 1027-1035
"Improved K-Medoids Clustering Based on Cluster Validity Index and Object Density"
by Bharat Pardeshi and Durga Toshniwal
in the Proceedings of the 2010 IEEE 2nd International Advance Computing Conference (IACC), February 2010, pp. 379-384
The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. In this paper, we present a way of initializing k-means by choosing random starting centers with very specific probabilities. By augmenting k-means with this simple, randomized seeding technique, we obtain an algorithm that is O(log k)-competitive with the optimal clustering. Preliminary experiments show that the augmentation improves both the speed and the accuracy of k-means.
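The abstract does not spell out the "very specific probabilities". In the original k-means++ paper by Arthur and Vassilvitskii, the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. A minimal Python sketch of that seeding step follows. The names `squared_distance` and `kmeanspp_seed` are hypothetical, chosen for illustration, and the uniform fallback for all-zero weights is an assumption of this sketch rather than part of the published method.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k starting centers from `points` using squared-distance weighting.

    Hypothetical helper: the name and signature are illustrative only.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # First center: uniform over all points.
    centers = [rng.choice(points)]

    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption of this sketch: if every point coincides with a
            # chosen center, fall back to a uniform draw.
            nxt = rng.choice(points)
        else:
            # Later centers: probability proportional to d2, so points far
            # from every existing center are favored.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Only the newly added center can shrink a point's nearest distance.
        d2 = [min(d, squared_distance(p, nxt)) for d, p in zip(d2, points)]

    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeanspp_seed(data, 2, random.Random(0)))
```

The returned centers would then be handed to the standard k-means (Lloyd) iterations in place of uniformly random starting points. A point that is already a center has weight zero, so it cannot be drawn again while any other point has positive weight.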