Weighted pooling high-throughput gene expression data sets to maximize the functional coherence of the top rank genes

2 Author(s)
Xiaodong Zhou ; Dept. of Anatomy and Neurobiology, University of Tennessee Health Science Center, Dept. of Computer Science, University of Memphis, Memphis, USA ; E. Olusegun George

In a typical gene expression study using a high-throughput technique such as microarray, a biologist usually focuses on the top genes ranked by P-value to establish gene functional relationships and networks, biological pathways, and the microbiological ramifications of the gene selection. With more datasets publicly available, researchers pool data from independent experiments, typically by pooling P-values with equal weight assigned to each dataset, aiming to extract more biological information from the pooled data. However, the quality of datasets may vary substantially, so assigning equal weights may not yield the optimal result. Applying the equal-weights approach to six independent datasets, we observe that the top-ranked genes of the pooled data have less functional coherence than those of the single dataset with the highest functional coherence. We propose a procedure based on enhanced simulated annealing (ESA) and literature semantic indexing cohesive (LSI-c) analysis to assign optimal weights to datasets so as to maximize the functional coherence of the top-ranked genes ordered by their pooled P-values. We observe significantly more functional coherence in optimally pooled data than in any single dataset or in data pooled with equal weights. Identifying top-ranked genes through our optimal procedure should improve downstream analysis.
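
The abstract does not say which pooling statistic or annealing schedule the authors use, so the following is only a minimal sketch under stated assumptions. It assumes one-sided P-values combined with a weighted Z-method (Stouffer-style), and it substitutes plain simulated annealing for ESA. The `coherence` argument is a hypothetical stand-in for an LSI-c score: any function that maps a list of top-ranked gene names to a number. All function and parameter names here are illustrative and do not come from the paper.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def pool_p_values(p_values, weights):
    """Pool one gene's one-sided P-values with a weighted Z-method (assumed choice)."""
    if len(p_values) != len(weights):
        raise ValueError("need exactly one weight per dataset")
    # inv_cdf requires 0 < p < 1; clip P-values beforehand if yours can hit 0 or 1.
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(numerator / denominator)


def top_genes(p_matrix, weights, k):
    """Return the k genes with the smallest pooled P-values.

    p_matrix maps a gene name to its list of P-values, one per dataset.
    """
    pooled = {gene: pool_p_values(ps, weights) for gene, ps in p_matrix.items()}
    return sorted(pooled, key=pooled.get)[:k]


def anneal_weights(p_matrix, coherence, k, steps=200, start_temp=1.0, seed=0):
    """Search dataset weights with plain simulated annealing (not the paper's ESA).

    coherence is a hypothetical scoring function: list of genes -> float,
    where larger means more functionally coherent.
    """
    rng = random.Random(seed)
    n_datasets = len(next(iter(p_matrix.values())))
    current = [1.0] * n_datasets  # start from the equal-weights baseline
    current_score = coherence(top_genes(p_matrix, current, k))
    best, best_score = list(current), current_score
    for step in range(steps):
        # Linear cooling; the small constant keeps the temperature positive.
        temp = start_temp * (1.0 - step / steps) + 1e-9
        # Perturb every weight, keeping all weights strictly positive.
        candidate = [max(1e-6, w + rng.gauss(0.0, 0.2)) for w in current]
        score = coherence(top_genes(p_matrix, candidate, k))
        delta = score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = list(candidate), score
    return best, best_score
```

Because the search starts from equal weights and keeps the best state seen, the returned score is never lower than the equal-weights baseline under the supplied scoring function. Whether that reproduces the improvement reported in the abstract depends entirely on the real coherence measure and data, which this sketch does not model.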

Published in:

Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on

Date of Conference:

12-15 Nov. 2011