A Population-Based Incremental Learning approach to microarray gene expression feature selection

5 Author(s)
Perez, M. ; Dept. of Electr. & Electron. Eng. Technol., Univ. of Johannesburg, Johannesburg, South Africa ; Rubin, D.M. ; Marwala, T. ; Scott, L.E.

The identification of a differentially expressed set of genes in microarray data analysis is essential, both for novel oncogenic pathway identification and for automated diagnostic purposes. This paper assesses the effectiveness of the Population-Based Incremental Learning (PBIL) algorithm in identifying a class-differentiating gene set for sample classification. PBIL iteratively evolves the genome of a search population by updating a probability vector, guided by the extent of class-separability demonstrated by a combination of features. PBIL is compared both to a standard Genetic Algorithm (GA) and to an Analysis of Variance (ANOVA) approach. The algorithms are tested on a publicly available three-class leukaemia microarray data set (n = 72). Over 30 repeats of both GA and PBIL, PBIL found an average feature-space separability of 97.04%, while GA achieved an average class-separability of 96.39%. PBIL also found smaller feature-spaces than GA (326 genes for PBIL versus 2652 for GA), thus excluding a large percentage of redundant features. On average, it also outperformed the ANOVA approach for n = 2652 (91.62%), q < 0.05 (94.44%), q < 0.01 (93.06%) and q < 0.005 (95.83%). The best PBIL run (98.61%) even outperformed ANOVA for n = 326 and q < 0.001 (both 97.22%). PBIL's performance is ascribed to its ability to direct the search not only towards the optimal solution, but also away from the worst.
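The probability-vector update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pbil_select`, `toy_fitness`), the learning rates, the population size and the stand-in fitness function are all assumptions made for illustration. The abstract does not say how class-separability is computed, so a toy score that rewards three hypothetical "informative" features and penalises subset size is used in its place. The sketch follows the general PBIL scheme with a negative learning rate and omits the mutation step that some PBIL variants include.

```python
import random

def pbil_select(fitness, n_features, pop_size=20, generations=50,
                lr=0.1, neg_lr=0.075, seed=0):
    """Sketch of PBIL for feature selection (all parameter values are assumptions).

    fitness: callable taking a 0/1 mask (list of ints), returning a score to maximise.
    Returns the best mask seen, its score, and the final probability vector.
    """
    rng = random.Random(seed)
    prob = [0.5] * n_features              # start with no preference per feature
    best_mask, best_score = None, float("-inf")

    for _ in range(generations):
        # Sample a population of feature masks from the probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        ranked = sorted(population, key=fitness)
        worst, best = ranked[0], ranked[-1]

        score = fitness(best)
        if score > best_score:
            best_mask, best_score = best[:], score

        for i in range(n_features):
            # Move towards the best individual of this generation.
            prob[i] = prob[i] * (1.0 - lr) + best[i] * lr
            # Where best and worst disagree, move further away from the worst.
            if best[i] != worst[i]:
                prob[i] = prob[i] * (1.0 - neg_lr) + best[i] * neg_lr

    return best_mask, best_score, prob


# Hypothetical stand-in for a class-separability score: features 0-2 are
# treated as informative, and every selected feature carries a small cost.
INFORMATIVE = {0, 1, 2}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score, prob = pbil_select(toy_fitness, n_features=10)
print("selected features:", [i for i, bit in enumerate(mask) if bit])
print("score:", score)
```

In a real setting, `toy_fitness` would be replaced by a measure of how well the selected genes separate the sample classes. The second update inside the loop is what steers the search away from the worst individual as well as towards the best.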

Published in:

Electrical and Electronics Engineers in Israel (IEEEI), 2010 IEEE 26th Convention of

Date of Conference:

17-20 Nov. 2010