In recent years, many studies have shown that microarray gene expression data are useful for disease identification and cancer classification. However, because such data typically contain thousands of genes measured simultaneously, accurate microarray classification can be difficult. Feature (gene) selection is a frequently used preprocessing technique for classifying microarray gene expression data. Selecting a useful gene subset for the classifier not only decreases computational time and cost, but can also increase classification accuracy. It is therefore important to extract only a small number of genes that are relevant to the classification of a particular cancer or disease type. In this paper, a correlation-based binary particle swarm optimization is proposed to select the relevant genes, and a K-nearest neighbor classifier with leave-one-out cross-validation is used to evaluate classification performance on six published cancer classification data sets. The experimental results show that the proposed method selects smaller gene subsets while still achieving higher prediction accuracy than other methods in the literature.
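The wrapper idea described above, a binary particle swarm searching over gene subsets with K-nearest neighbor leave-one-out accuracy as the fitness, can be illustrated with a minimal sketch. The abstract does not specify the correlation-based component, so it is left out here; what follows is a generic binary PSO with a sigmoid transfer function, not the authors' method. The function names, the parameter values (`w`, `c1`, `c2`, `v_max`, swarm size), and the tie-break that prefers fewer genes are all illustrative assumptions.

```python
import math
import random


def loocv_knn_accuracy(X, y, mask, k=1):
    """Leave-one-out accuracy of a K-NN classifier using only genes with mask[j] == 1."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 0.0  # assumption: an empty gene subset scores zero
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from held-out sample i to every other sample
        dists = sorted(
            (sum((X[i][j] - X[n][j]) ** 2 for j in genes), y[n])
            for n in range(len(X)) if n != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def binary_pso(X, y, n_particles=10, n_iter=20, w=0.9, c1=2.0, c2=2.0,
               v_max=4.0, seed=0):
    """Generic binary PSO over gene masks (hypothetical parameter values)."""
    rng = random.Random(seed)
    n_genes = len(X[0])

    def fitness(mask):
        # Assumption: maximize accuracy first, then prefer fewer selected genes
        return (loocv_knn_accuracy(X, y, mask), -sum(mask))

    pos = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_particles)]
    vel = [[0.0] * n_genes for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_genes):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp velocity
                # Sigmoid of the velocity gives the probability that the bit is 1
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit[0]


if __name__ == "__main__":
    # Toy data (made up): gene 0 separates the classes, gene 1 is misleading
    X = [[0.0, 5.0], [0.1, -5.0], [1.0, 5.0], [1.1, -5.0]]
    y = [0, 0, 1, 1]
    mask, acc = binary_pso(X, y)
    print("selected genes:", [j for j, b in enumerate(mask) if b], "LOOCV accuracy:", acc)
```

On real microarray data, `X` would hold one row per sample and one column per gene, and each fitness evaluation costs a full leave-one-out pass, which is one reason small swarms and modest iteration counts are common in wrapper-style gene selection.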