Clustering high-dimensional data remains an open problem in modern data mining. This paper proposes a new genetic algorithm-based feature selection method for high-dimensional data clustering, called GA-FSFclustering. The approach uses a genetic algorithm to search the full feature set for subsets that are effective for clustering. Both the candidate features and the cluster centers are encoded as real numbers. A new criterion for evaluating feature subsets serves as the fitness function. The experimental results indicate the feasibility and efficiency of the GA-FSFclustering algorithm.
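The general idea of a real-coded chromosome that carries both feature genes and cluster centers can be illustrated with a minimal Python sketch. This is not the GA-FSFclustering implementation: the abstract does not state the chromosome layout, the genetic operators, or the new fitness criterion, so everything below is an assumption made for illustration. In particular, the 0.5 threshold for keeping a feature, the separation-to-scatter ratio used as fitness, the tournament selection, the uniform crossover, the Gaussian mutation, and the names `decode`, `fitness`, `run_ga`, and `make_data` are all hypothetical. The toy data has two informative features and four noise features, all scaled to [0, 1].

```python
import random

def decode(chrom, d, k):
    """Split a real-coded chromosome into a feature mask and k cluster centers."""
    mask = [g > 0.5 for g in chrom[:d]]            # assumption: gene > 0.5 keeps the feature
    centers = [chrom[d + j * d: d + (j + 1) * d] for j in range(k)]
    return mask, centers

def fitness(chrom, data, d, k):
    """Placeholder criterion (NOT the paper's): center separation / within-cluster scatter."""
    mask, centers = decode(chrom, d, k)
    idx = [i for i in range(d) if mask[i]]
    if not idx:
        return 0.0                                 # an empty feature subset scores zero
    def dist2(a, b):
        return sum((a[i] - b[i]) ** 2 for i in idx)
    within = sum(min(dist2(x, c) for c in centers) for x in data) / len(data)
    pairs = [(a, b) for a in range(k) for b in range(a + 1, k)]   # requires k >= 2
    between = sum(dist2(centers[a], centers[b]) for a, b in pairs) / len(pairs)
    return between / (within + 1e-9)

def run_ga(data, k=2, pop_size=30, generations=40, seed=0):
    """Evolve chromosomes; returns the best one and the best fitness per generation."""
    rng = random.Random(seed)
    d = len(data[0])
    n_genes = d + k * d                            # d feature genes, then k centers of d coords
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        fits = [fitness(c, data, d, k) for c in pop]
        best_i = max(range(pop_size), key=fits.__getitem__)
        history.append(fits[best_i])
        nxt = [pop[best_i][:]]                     # elitism: best individual survives unchanged
        def pick():                                # tournament selection of size 3
            return pop[max(rng.sample(range(pop_size), 3), key=fits.__getitem__)]
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]  # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1))) if rng.random() < 0.1 else g
                     for g in child]               # Gaussian mutation, clamped to [0, 1]
            nxt.append(child)
        pop = nxt
    fits = [fitness(c, data, d, k) for c in pop]
    best_i = max(range(pop_size), key=fits.__getitem__)
    history.append(fits[best_i])
    return pop[best_i], history

def make_data(seed=1, n=40):
    """Toy data: features 0-1 form two clusters, features 2-5 are uniform noise."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        base = 0.2 if i < n // 2 else 0.8
        informative = [min(1.0, max(0.0, rng.gauss(base, 0.05))) for _ in range(2)]
        data.append(informative + [rng.random() for _ in range(4)])
    return data

data = make_data()
best, history = run_ga(data)
mask, centers = decode(best, len(data[0]), 2)
print("selected features:", [i for i, keep in enumerate(mask) if keep])
print("best fitness, first vs. last generation:", history[0], history[-1])
```

Because the best individual is copied unchanged into each new population, the recorded best fitness never decreases across generations. Which features end up selected depends on the placeholder criterion and the random seed, so the printed subset should not be read as evidence about the method described in the paper.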
Published in:
Genetic and Evolutionary Computing, 2009. WGEC '09. 3rd International Conference on
Date of Conference: 14-17 Oct. 2009