A fast approximate kernel k-means clustering method for large data sets

3 Author(s)
Sarma, T.H. ; Dept. of Comput. Sci. & Eng., Rajeev Gandhi Memorial Coll. of Eng. & Technol., Nandyal, India ; Viswanath, P. ; Reddy, B.E.

In unsupervised classification, the kernel k-means clustering method has been shown to perform better than the conventional k-means clustering method at identifying non-isotropic clusters in a data set. The space and time requirements of this method are O(n^2), where n is the data set size. The paper proposes a two-stage hybrid approach to speed up the kernel k-means clustering method. In the first stage, the data set is divided into a number of group-lets by employing a fast clustering method called the leaders clustering method. Each group-let is represented by a prototype called its leader. The set of leaders, which depends on a threshold parameter, can be derived in O(n) time. The paper presents a modification to the leaders clustering method in which group-lets are found in the kernel space (not in the input space) but are represented by leaders in the input space. In the second stage, the kernel k-means clustering method is applied to the set of leaders to derive a partition of that set. Finally, each leader is replaced by its group-let to obtain a partition of the data set. The proposed method has a time complexity of O(n + p^2), where p is the size of the leaders set. Its space complexity is also O(n + p^2). The proposed method can be easily implemented. Experimental results show that, with a small loss of quality, the proposed method significantly reduces the running time compared with the conventional kernel k-means clustering method.
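
The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not specify the kernel, the initialization, or whether leaders are weighted by group-let size, so the Gaussian (RBF) kernel, the random initial labels, the unweighted kernel k-means, and all function and parameter names (`rbf_kernel`, `leaders_in_kernel_space`, `kernel_kmeans`, `fast_kernel_kmeans`, `tau`, `gamma`) are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    # The choice of kernel is an assumption; the abstract does not name one.
    sq = (A**2).sum(1)[:, None] - 2.0 * A @ B.T + (B**2).sum(1)[None, :]
    return np.exp(-gamma * np.maximum(sq, 0.0))

def leaders_in_kernel_space(X, tau, gamma):
    # Stage 1: one scan over the data. A point joins the first leader whose
    # kernel-space distance is <= tau; otherwise it becomes a new leader.
    leader_idx = []                      # leaders are stored as input-space points
    assign = np.empty(len(X), dtype=int)
    for i, x in enumerate(X):
        if leader_idx:
            k = rbf_kernel(x[None, :], X[leader_idx], gamma)[0]
            d2 = 2.0 - 2.0 * k           # ||phi(x) - phi(l)||^2 when k(z, z) = 1
            near = np.flatnonzero(d2 <= tau**2)
            if near.size:
                assign[i] = near[0]
                continue
        assign[i] = len(leader_idx)
        leader_idx.append(i)
    return np.array(leader_idx), assign

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    # Stage 2: plain (unweighted) kernel k-means on a p x p kernel matrix.
    p = K.shape[0]
    labels = np.random.default_rng(seed).integers(n_clusters, size=p)
    for _ in range(n_iter):
        dist = np.full((p, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue                 # an empty cluster attracts no points
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].sum(1) / m
                          + K[np.ix_(members, members)].sum() / m**2)
        new_labels = dist.argmin(1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def fast_kernel_kmeans(X, n_clusters, tau, gamma):
    leader_idx, assign = leaders_in_kernel_space(X, tau, gamma)
    L = X[leader_idx]
    K = rbf_kernel(L, L, gamma)          # p x p instead of n x n
    leader_labels = kernel_kmeans(K, n_clusters)
    return leader_labels[assign], leader_idx  # each point inherits its leader's label

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (200, 2)), rng.normal(4, 0.3, (200, 2))])
labels, leader_idx = fast_kernel_kmeans(X, n_clusters=2, tau=0.5, gamma=0.5)
print(len(leader_idx), "leaders for", len(X), "points")
```

In this sketch the leader scan performs about n x p kernel evaluations, which is linear in n when p is bounded, and only the p x p kernel matrix over the leaders is ever materialized. A larger `tau` yields fewer leaders and a faster second stage, at the cost of a coarser partition.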

Published in:

Recent Advances in Intelligent Computational Systems (RAICS), 2011 IEEE

Date of Conference:

22-24 Sept. 2011