Many problems in information processing involve some form of dimensionality reduction. In this paper, we propose a new model for feature evaluation and selection in unsupervised learning scenarios. The model makes no special assumptions about the nature of the data set. For each item in the data set, the original features induce a ranked list of the items among its k nearest neighbors. The evaluation criterion favors reduced feature sets whose induced rankings are most consistent with these ranked lists. An efficient local descent search based on the model is adopted to select the reduced features. Our experiments with several data sets demonstrate that the proposed algorithm is able to detect completely irrelevant features and to remove some additional features without significantly hurting the performance of the clustering algorithm.
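
The idea summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact consistency measure or search procedure, so this sketch assumes Euclidean distance, scores consistency as plain neighbor-set overlap (which ignores rank order inside the list), and uses a greedy backward elimination as a stand-in for the local descent search. The function names and the `tol` parameter are hypothetical.

```python
import numpy as np


def knn_rankings(X, k):
    """For each row of X, return the indices of its k nearest neighbors, nearest first."""
    # Pairwise Euclidean distances (assumed metric; the abstract does not name one).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # an item is never its own neighbor
    return np.argsort(d, axis=1, kind="stable")[:, :k]


def consistency(X, features, ref, k):
    """Mean fraction of each item's reference neighbors recovered using only `features`.

    Simplification: this compares neighbor sets and ignores their order.
    """
    cand = knn_rankings(X[:, features], k)
    overlap = [len(set(r) & set(c)) for r, c in zip(ref, cand)]
    return float(np.mean(overlap)) / k


def backward_local_search(X, k=5, tol=0.05):
    """Greedily drop features while consistency with the full-feature rankings stays >= 1 - tol."""
    ref = knn_rankings(X, k)  # rankings induced by the original features
    selected = list(range(X.shape[1]))
    while len(selected) > 1:
        # Score every single-feature removal and keep the least harmful one.
        scores = [
            (consistency(X, [f for f in selected if f != g], ref, k), g)
            for g in selected
        ]
        best_score, drop = max(scores)
        if best_score < 1.0 - tol:
            break  # every further removal distorts the neighbor lists too much
        selected.remove(drop)
    return selected
```

Under these assumptions, a constant column leaves every pairwise distance unchanged, so it is removed first; a rank-aware measure could replace the set overlap in `consistency` without changing the rest of the sketch.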
Published in:
Intelligent Computation Technology and Automation (ICICTA), 2010 International Conference on
(Volume: 2)
Date of Conference: 11-12 May 2010