In the unsupervised learning paradigm, class labels are not available, so which features should we keep? Unsupervised feature selection addresses this problem and has proven effective for selecting features from unlabeled data. Laplacian Score (LS) is a recently proposed unsupervised feature selection algorithm. However, it relies on k-means clustering to select the top k features, so the well-known drawbacks of k-means (sensitivity to initialization and the need to specify k in advance) directly affect the result and increase the complexity of LS. In this paper, we introduce a novel algorithm called LSE (Laplacian Score combined with distance-based Entropy measure) for automatically selecting a subset of features. LSE replaces the k-means clustering step of LS with a distance-based entropy measure, which intrinsically removes the drawbacks of LS and contributes to the stability and efficiency of LSE. We compare LSE with LS on six UCI data sets. Experimental results demonstrate that LSE outperforms LS in stability and efficiency, especially on high-dimensional data sets.
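To make the Laplacian Score step concrete, the following is a minimal sketch of the standard LS feature-ranking computation (not the authors' LSE code): build a k-nearest-neighbour graph with heat-kernel weights, then score each feature by how well it preserves that local structure. The parameters `k` and `t` and the brute-force distance computation are illustrative choices, not taken from the paper; lower scores indicate more informative features.

```python
import numpy as np

def laplacian_score(X, k=5, t=1.0):
    """Rank features of X (n_samples x n_features) by Laplacian Score.

    Returns one score per feature; smaller = better at preserving
    local neighbourhood structure. This is a brute-force sketch of
    the standard LS algorithm, not the paper's LSE implementation.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # k-nearest-neighbour graph with heat-kernel weights S_ij = exp(-d2/t).
    S = np.zeros((n, n))
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]  # skip self (column 0)
    for i in range(n):
        for j in nn[i]:
            w = np.exp(-d2[i, j] / t)
            S[i, j] = S[j, i] = w
    deg = S.sum(axis=1)          # degree vector
    D = np.diag(deg)
    L = D - S                    # graph Laplacian
    ones = np.ones(n)
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        # Remove the degree-weighted mean so constant features score 0/0-free.
        f_t = f - (f @ deg) / deg.sum() * ones
        denom = f_t @ D @ f_t
        scores[r] = (f_t @ L @ f_t) / denom if denom > 1e-12 else np.inf
    return scores
```

In LS, the resulting scores are thresholded via k-means; the paper's contribution is to replace that clustering step with a distance-based entropy measure so the feature-subset size is chosen automatically.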