Fast and parallelized greedy forward selection of genetic variants in Genome-wide association studies

5 Author(s)
Okser, S. ; Turku Centre for Comput. Sci., Univ. of Turku, Turku, Finland ; Pahikkala, T. ; Airola, A. ; Aittokallio, T.

We present the application of a regularized least-squares (RLS) based algorithm, known as greedy RLS, to wrapper-based feature selection on an entire genome-wide association dataset. Wrapper methods were previously thought to be computationally infeasible for studies of this type. The running time of the method grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. Moreover, we show how it can be further accelerated using parallel computation on multi-core processors. We tested the method on the Wellcome Trust Case Control Consortium's (WTCCC) Type 2 Diabetes - UK National Blood Service dataset, consisting of 3,382 subjects and 404,569 single nucleotide polymorphisms (SNPs). Our method is capable of high-speed feature selection, selecting the top 100 predictive SNPs in under five minutes on a high-end desktop, and it outperforms typical filter approaches in terms of predictive performance.
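To make the idea of wrapper-based greedy forward selection concrete, the following is a minimal, naive sketch in plain Python. It is not the authors' greedy RLS implementation: it refits a ridge (regularized least-squares) model for every candidate feature, so it does not have the linear running time described in the abstract. The selection criterion (leave-one-out squared error), the regularization value `lam`, the -1/0/1 genotype coding, and all function names are assumptions made for illustration only.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy, inputs untouched
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def loo_error(X, y, subset, lam):
    """Mean leave-one-out squared error of ridge regression on the given columns.

    Uses the standard shortcut e_i = (y_i - yhat_i) / (1 - H_ii), so the model
    is fitted once per subset instead of once per held-out example.
    """
    rows = [[xi[j] for j in subset] for xi in X]
    k = len(subset)
    # G = X_S^T X_S + lam * I
    G = [[sum(r[a] * r[b] for r in rows) + (lam if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    rhs = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    w = solve(G, rhs)
    total = 0.0
    for r, yi in zip(rows, y):
        pred = sum(wa * ra for wa, ra in zip(w, r))
        z = solve(G, r)
        h = sum(za * ra for za, ra in zip(z, r))  # leverage H_ii, below 1 when lam > 0
        total += ((yi - pred) / (1.0 - h)) ** 2
    return total / len(rows)


def greedy_forward_selection(X, y, n_select, lam=1.0):
    """Add, one at a time, the feature whose inclusion gives the lowest LOO error."""
    selected = []
    remaining = list(range(len(X[0])))
    for _ in range(n_select):
        # Each candidate is scored independently of the others.
        best = min(remaining, key=lambda j: loo_error(X, y, selected + [j], lam))
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic toy data (not the WTCCC dataset): 60 subjects, 8 SNP-like features
# coded -1/0/1, with the signal placed in columns 2 and 5 by construction.
random.seed(0)
n_subjects, n_snps = 60, 8
X = [[random.choice([-1.0, 0.0, 1.0]) for _ in range(n_snps)]
     for _ in range(n_subjects)]
y = [3.0 * row[2] - 1.0 * row[5] + random.gauss(0.0, 0.1) for row in X]

selected = greedy_forward_selection(X, y, n_select=2)
print(selected)
```

Because each candidate feature is scored independently within a selection round, the inner `min` over `remaining` is the natural place to split work across cores, which is one plausible reading of the multi-core acceleration mentioned in the abstract. The sketch keeps the loop serial for clarity.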

Published in:

Genomic Signal Processing and Statistics (GENSIPS), 2011 IEEE International Workshop on

Date of Conference:

4-6 Dec. 2011