Clustering high dimensional gene expression data via two step feature filtering

3 Author(s)

Because gene expression data are important in cancer diagnosis and treatment, microarray gene expression data have attracted growing attention from cancer researchers in recent years. In real-world computational analysis, however, such data commonly suffer from the curse of dimensionality: tens of thousands of gene expression measurements are available for only a small number of samples. Developing an effective clustering method for such high-dimensional datasets is therefore a challenging problem. Here, we use two-step feature filtering and dimensionality reduction methods to reduce the dimension of gene expression data. First, we extract a subset of genes based on ReliefF and the Fast Correlation-Based Filter (FCBF). Then k-means (KM), KM with principal component analysis (PCA), and KM with random projection (RP) are each applied to the reduced gene dataset to produce clusters of cancer samples. Experimental results on the small round blue-cell tumor (SRBCT) data set demonstrate that two-step feature filtering can significantly improve the performance of the KM clustering algorithm and facilitate the application of PCA and RP in high-dimensional space, and show the effectiveness and efficiency of our proposed scheme in addressing high-dimensional gene expression data.
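The pipeline described in the abstract (filter genes in two steps, then cluster the reduced data, optionally after a projection) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simplified Relief scorer (one nearest hit and miss per sample) in place of ReliefF, and a Pearson-correlation redundancy filter as a stand-in for FCBF, which actually uses symmetrical uncertainty. The function names, the `keep` and `max_corr` parameters, and the synthetic toy data are all hypothetical. The PCA variant is omitted for brevity.

```python
import math
import random


def relief_scores(X, y):
    """Simplified Relief (assumption: one nearest hit and miss per sample)."""
    n, d = len(X), len(X[0])
    # Feature ranges, used to put all differences on a 0..1 scale.
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def dist(a, b):
        return sum(abs(a[j] - b[j]) / spans[j] for j in range(d))

    w = [0.0] * d
    for i in range(n):
        others = [k for k in range(n) if k != i]
        hit = min((k for k in others if y[k] == y[i]), key=lambda k: dist(X[i], X[k]))
        miss = min((k for k in others if y[k] != y[i]), key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that differ across classes and agree within a class.
            w[j] += (abs(X[i][j] - X[miss][j]) - abs(X[i][j] - X[hit][j])) / spans[j] / n
    return w


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va and vb else 0.0


def two_step_filter(X, y, keep=10, max_corr=0.9):
    """Step 1: keep the top Relief-ranked features. Step 2: drop redundant ones."""
    w = relief_scores(X, y)
    ranked = sorted(range(len(w)), key=lambda j: -w[j])[:keep]
    cols = {j: [r[j] for r in X] for j in ranked}
    selected = []
    for j in ranked:  # best-ranked first, so the stronger feature of a pair survives
        if all(abs(pearson(cols[j], cols[s])) < max_corr for s in selected):
            selected.append(j)
    return selected


def random_projection(X, out_dim, rng):
    """Project rows of X through a Gaussian random matrix scaled by 1/sqrt(out_dim)."""
    d = len(X[0])
    R = [[rng.gauss(0, 1) / math.sqrt(out_dim) for _ in range(out_dim)] for _ in range(d)]
    return [[sum(r[j] * R[j][c] for j in range(d)) for c in range(out_dim)] for r in X]


def kmeans(X, k, rng, iters=50):
    """Plain Lloyd iterations with centers initialized from random samples."""
    centers = [list(c) for c in rng.sample(X, k)]
    labels = [0] * len(X)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])))
                  for x in X]
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic toy data (not SRBCT): many features, few samples, 5 informative features.
rng = random.Random(0)
n_per_class, d, informative = 10, 200, 5
X, y = [], []
for label in (0, 1):
    for _ in range(n_per_class):
        row = [rng.gauss(0, 1) for _ in range(d)]
        for j in range(informative):
            row[j] += 6.0 * label  # class-dependent shift on informative features
        X.append(row)
        y.append(label)

selected = two_step_filter(X, y, keep=10)
X_sel = [[r[j] for j in selected] for r in X]
labels_km = kmeans(X_sel, 2, rng)                            # KM on filtered features
labels_rp = kmeans(random_projection(X_sel, 3, rng), 2, rng)  # KM after RP
print(len(selected), "features kept;", "KM labels:", labels_km)
```

Filtering before projecting keeps the random projection and the k-means distance computations small, which is the motivation the abstract gives for applying feature filtering first.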

Published in:

Computer Sciences and Convergence Information Technology (ICCIT), 2011 6th International Conference on

Date of Conference:

Nov. 29 2011-Dec. 1 2011