Improvements in the Partitions Selection Strategy for Set of Clustering Solutions

4 Author(s)
Sakata, T.C. ; Univ. Fed. de Sao Carlos - Campus Sorocaba, Sorocaba, Brazil ; Faceli, K. ; de Souto, M.C.P. ; de Carvalho, A.C.P.L.F.

No clustering algorithm is guaranteed to find the actual groups in any dataset. Thus, selecting the most suitable clustering algorithm for a given dataset is not easy. To deal with this problem, one can apply various clustering algorithms to the dataset, generating a set of partitions (solutions). Next, one can choose the best partition generated according to a given validation measure, although such measures are usually biased towards one or more clustering algorithms. However, in many cases, it is interesting to have more than one solution. In a previous work, we proposed a selection strategy able to reduce the number of solutions obtained from Pareto-based multi-objective genetic algorithms. This selection strategy uses the corrected Rand index to select a subset of the most different partitions. The size of the solution set is controlled by a threshold on the value of this index, given as an external parameter. Reducing the threshold value decreases the number of solutions. Since the choice of such a threshold value is not intuitive, this paper describes a modification of the original selection algorithm that automatically adjusts this threshold and guarantees the selection of the most evident partitions, which were simultaneously obtained with distinct clustering criteria. The new version does not require any user settings, presents a better number of solutions and maintains the diversity of the partitions in the reduced set.
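
The threshold-based selection described above can be illustrated with a minimal sketch. The abstract does not give the algorithm itself, so the greedy rule below is an assumption, not the authors' method: a partition is kept only if its corrected Rand index against every partition already kept is below the threshold. The function names `corrected_rand` and `select_diverse` are hypothetical, and the automatic threshold adjustment proposed in the paper is not reproduced here.

```python
from collections import Counter
from math import comb


def corrected_rand(a, b):
    """Corrected (adjusted) Rand index between two label sequences."""
    n = len(a)
    if n < 2:
        return 1.0  # assumption: treat degenerate input as identical
    # Pair counts from the contingency table and from its margins.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial, so they agree
    return (sum_ij - expected) / (max_index - expected)


def select_diverse(partitions, threshold):
    """Greedy filter (an assumed rule): keep a partition only if it is
    sufficiently different from every partition kept so far."""
    selected = []
    for p in partitions:
        if all(corrected_rand(p, q) < threshold for q in selected):
            selected.append(p)
    return selected


partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, relabelled
    [0, 0, 1, 1, 2, 2],
    [0, 1, 0, 1, 0, 1],
]
print(len(select_diverse(partitions, threshold=0.9)))
print(len(select_diverse(partitions, threshold=0.0)))
```

In this sketch, lowering the threshold makes the filter stricter, so fewer partitions survive, which matches the behaviour the abstract attributes to the original strategy. The result of a greedy pass depends on the order in which partitions are visited; the paper may handle ordering differently.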

Published in:

Neural Networks (SBRN), 2010 Eleventh Brazilian Symposium on

Date of Conference:

23-28 Oct. 2010