Simultaneous feature selection and clustering for categorical features using multi objective genetic algorithm

Author(s):
Dipankar Dutta ; Department of Computer Science and Information Technology, University Institute of Technology, The University of Burdwan, Golapbug (North), West Bengal, India PIN-713104 ; Paramartha Dutta ; Jaya Sil

Clustering is unsupervised learning in which, ideally, neither the class labels nor the number of clusters (K) is known. K-clustering, in which K is known, can be categorized as semi-supervised learning. Here we consider K-clustering with simultaneous feature selection. Feature subset selection helps to identify the features relevant for clustering and improves understandability, scalability, and accuracy. We use two measures for clustering: intra-cluster distance (Homogeneity, H) and inter-cluster distance (Separation, S). Both measures use the mod distance per feature, which is suitable for categorical features (attributes). Rather than combining H and S to frame the problem as a single-objective optimization problem, we use a multi-objective genetic algorithm (MOGA) to find diverse solutions near the Pareto-optimal front in the two-dimensional objective space. Each evolved solution represents a set of cluster modes (CMs) built from the selected feature subset. K-modes is hybridized with MOGA to combine the global search power of the GA with the local search power of K-modes. To account for context sensitivity, we use special operators called “pairwise crossover” and “substitution”. The main contribution of this paper is simultaneous dimensionality reduction and optimization of the objectives using MOGA. Results on three benchmark data sets with categorical features from the UCI Machine Learning Repository show the superiority of the algorithm.
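The two objectives described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define "mod distance", so the sketch assumes a simple mismatch count averaged over the selected features. The function names, the Boolean feature mask, and the summation used for H and S are all hypothetical choices made for illustration. H is treated as a quantity to minimize and S as a quantity to maximize.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Record = Sequence[str]  # one categorical record, one value per feature


def mismatch_distance(a: Record, b: Record, mask: Sequence[bool]) -> float:
    """Fraction of selected features on which two records differ.

    Assumption: this stands in for the paper's "mod distance per feature".
    """
    selected = [(x, y) for x, y, keep in zip(a, b, mask) if keep]
    if not selected:
        raise ValueError("at least one feature must be selected")
    return sum(1 for x, y in selected if x != y) / len(selected)


def assign(records: Sequence[Record], modes: Sequence[Record],
           mask: Sequence[bool]) -> List[int]:
    """Index of the nearest cluster mode for every record."""
    return [
        min(range(len(modes)),
            key=lambda k: mismatch_distance(r, modes[k], mask))
        for r in records
    ]


def homogeneity(records: Sequence[Record], modes: Sequence[Record],
                mask: Sequence[bool]) -> float:
    """H: summed distance from each record to its nearest mode (lower is better)."""
    labels = assign(records, modes, mask)
    return sum(mismatch_distance(r, modes[k], mask)
               for r, k in zip(records, labels))


def separation(modes: Sequence[Record], mask: Sequence[bool]) -> float:
    """S: summed distance over all pairs of modes (higher is better)."""
    return sum(mismatch_distance(a, b, mask)
               for a, b in combinations(modes, 2))


def dominates(p: Tuple[float, float], q: Tuple[float, float]) -> bool:
    """True if (H, S) point p is no worse than q on both objectives and better on one."""
    return p[0] <= q[0] and p[1] >= q[1] and (p[0] < q[0] or p[1] > q[1])


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep only the (H, S) points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

In a full MOGA, each chromosome would carry both a feature mask and a set of cluster modes, and `pareto_front` would be applied to the (H, S) pairs of the whole population. The genetic operators and the K-modes refinement step are omitted here because the abstract does not specify them.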

Published in:

Hybrid Intelligent Systems (HIS), 2012 12th International Conference on

Date of Conference:

4-7 Dec. 2012