By Topic

A Semi-supervised Approach to Estimate the Number of Clusters per Class

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Davidson M. Sestaro ; Catholic Univ. of Santos (UniSantos) at Santos, Santos, Brazil ; Thiago F. Covões ; Eduardo R. Hruschka

The disparity between the available amount of unlabeled and labeled data in several applications made semi-supervised learning become an active research topic. Most studies on semi-supervised clustering assume that the number of classes is equal to the number of clusters. This paper introduces a semi-supervised clustering algorithm, named Multiple Clusters per Class k-means (MCCK), which estimates the number of clusters per class via pair wise constraints generated from class labels. Experiments with eight datasets indicate that the algorithm outperforms three traditional algorithms for semi-supervised clustering, especially when the one-cluster-per-class assumption does not hold. Finally, the learned structure can offer a valuable description of the data in several applications. For instance, it can aid the identification of subtypes of diseases in medical diagnosis problems.

Published in:

2012 Brazilian Symposium on Neural Networks

Date of Conference:

20-25 Oct. 2012