Comparison of hybrid feature selection models on gene expression data

4 Author(s)
Patharawut Saengsiri (Department of Information Technology, Faculty of Information Technology, KMUTNB, Bangkok, Thailand); Phayung Meesad; Sageemas Na Wichian; Unger Herwig

Microarray data contain expression levels for thousands of genes. However, most of these genes are not associated with cancer, and their presence leads to the curse of dimensionality. The central challenge with microarray data is therefore feature selection: searching for subsets of informative genes. Current techniques rely on filter and wrapper approaches to discover such subsets. The filter approach is faster than the wrapper approach, whereas the wrapper approach is more accurate. It would be more beneficial to reduce processing time and increase accuracy simultaneously when searching for subsets of genes. This paper therefore compares hybrid feature selection models on gene expression datasets in four steps: 1) filter a subgroup of genes using Correlation-based Feature Selection (CFS), Gain Ratio (GR), and Information Gain (INFO); 2) pass the output of each filter method to a wrapper approach based on the Support Vector Machine (SVM) classifier and two heuristic searches, Greedy Search (GS) and Genetic Algorithm (GA); 3) generate the hybrid feature selection models CFSSVMGA, CFSSVMGS, GRSVMGA, GRSVMGS, INFOSVMGA, and INFOSVMGS; 4) compare performance using precision, recall, F-measure, and accuracy rate. The experimental results show that the CFSSVMGA model outperformed the other models on three public gene expression datasets.
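
The filter-then-wrapper idea in the abstract can be sketched in a few lines of Python. This is only an illustration under several assumptions, not the authors' implementation. It uses an information gain filter followed by a forward greedy search, which loosely corresponds to the INFO and GS combination. A nearest-centroid classifier stands in for the SVM so that the example needs only the standard library. The median split used to discretize expression values, the leave-one-out evaluation, the toy data, and all function names are hypothetical choices made for this sketch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    """Information gain of one gene after a median split (assumed discretization)."""
    threshold = sorted(column)[len(column) // 2]
    gain = entropy(labels)
    for side in (True, False):
        subset = [y for x, y in zip(column, labels) if (x >= threshold) == side]
        if subset:
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain

def filter_genes(X, y, k):
    """Filter step: indices of the k genes with the highest information gain."""
    n_genes = len(X[0])
    scores = [information_gain([row[j] for row in X], y) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:k]

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[r] for r in range(len(X)) if r != i and y[r] == label]
            if rows:
                centroids[label] = [sum(row[g] for row in rows) / len(rows) for g in genes]
        sample = [X[i][g] for g in genes]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sample, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)

def greedy_wrapper(X, y, candidates):
    """Wrapper step: forward greedy search over the genes kept by the filter."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        for g in candidates:
            if g in selected:
                continue
            score = loo_accuracy(X, y, selected + [g])
            if score > best:  # only accept a gene that strictly improves accuracy
                best, choice, improved = score, g, True
        if improved:
            selected.append(choice)
    return selected, best

# Toy expression matrix (rows = samples, columns = genes); values are made up.
X = [
    [5.1, 0.2, 1.0, 3.0],
    [4.8, 0.9, 1.0, 2.5],
    [5.3, 0.4, 1.0, 3.5],
    [1.2, 0.3, 1.0, 2.8],
    [0.9, 0.8, 1.0, 3.2],
    [1.1, 0.5, 1.0, 2.0],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

candidates = filter_genes(X, y, k=2)               # cheap filter pass over all genes
selected, accuracy = greedy_wrapper(X, y, candidates)  # costly wrapper pass on survivors
print(candidates, selected, accuracy)
```

The design point is the one the abstract makes: the inexpensive filter shrinks the candidate set first, so the expensive wrapper, which retrains a classifier for every candidate subset, only has to search among the surviving genes.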

Published in:

2010 Eighth International Conference on ICT and Knowledge Engineering

Date of Conference:

24-25 Nov. 2010