DNA microarray technology makes it possible to measure the expression levels of thousands of genes simultaneously. It promises to monitor the whole genome on a single chip, giving researchers a better picture of the interactions among thousands of genes. Extracting information from such a large volume of data is a challenge for data mining. One important application of gene expression microarray data is cancer classification. However, in gene expression data collected for cancer classification, the number of genes usually far exceeds the number of samples, and working in such a high-dimensional space is extremely difficult. Previous research has used two-stage classification methods to handle gene expression data: the first stage selects genes to reduce the dimensionality of the problem, and the second stage classifies tumors. In this study, the ant colony optimization (ACO) algorithm is first used to select genes relevant to cancer; multi-layer perceptron (MLP) neural network and support vector machine (SVM) classifiers are then used for cancer classification. Experimental results show that selecting genes with the ACO algorithm can improve the accuracy of the MLP and SVM classifiers. The optimal number of genes selected for cancer classification should be set according to both the microarray dataset and the gene selection method.
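The two-stage idea described above (select genes, then classify) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameters `n_ants`, `n_iter`, and `rho`, and the pheromone update rule are all assumptions chosen for illustration, and a simple leave-one-out nearest-centroid classifier stands in for the MLP and SVM classifiers so that the example needs only the Python standard library.

```python
import random


def nearest_centroid_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier
    restricted to the gene (column) indices in `genes`.
    This is a stand-in for the MLP/SVM second stage (assumption)."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Build class centroids from every sample except sample i.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(genes))
            for k, g in enumerate(genes):
                acc[k] += row[g]
            counts[label] = counts.get(label, 0) + 1
        # Predict the class whose centroid is closest to sample i.
        best_label, best_dist = None, float("inf")
        for label, acc in sums.items():
            dist = sum((X[i][g] - acc[k] / counts[label]) ** 2
                       for k, g in enumerate(genes))
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def aco_select_genes(X, y, n_select, n_ants=10, n_iter=20, rho=0.1, seed=0):
    """Hypothetical ACO-style gene selection (illustrative only).
    Each ant samples `n_select` distinct genes with probability
    proportional to pheromone; the best subset found so far is
    reinforced after every iteration."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    pheromone = [1.0] * n_genes
    best_genes, best_score = None, -1.0
    for _ in range(n_iter):
        for _ in range(n_ants):
            candidates = list(range(n_genes))
            weights = pheromone[:]
            genes = []
            for _ in range(n_select):
                idx = rng.choices(range(len(candidates)), weights=weights)[0]
                genes.append(candidates.pop(idx))
                weights.pop(idx)
            score = nearest_centroid_accuracy(X, y, genes)
            if score > best_score:
                best_genes, best_score = genes, score
        # Evaporate all trails, then deposit on the best subset so far.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for g in best_genes:
            pheromone[g] += best_score
    return best_genes, best_score


if __name__ == "__main__":
    # Toy data (made up): gene 0 separates the classes, genes 1-2 do not.
    X = [[0.0, 5.0, 1.0], [0.1, 1.0, 5.0], [1.0, 5.0, 1.2], [1.1, 1.0, 4.9]]
    y = [0, 0, 1, 1]
    print(aco_select_genes(X, y, n_select=1))
```

In a realistic setting, the fitness function would be the cross-validated accuracy of the actual classifier, and the number of selected genes (`n_select` here) would be tuned per dataset, in line with the abstract's closing remark.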