Variable selection is the problem of choosing the subset of explanatory variables for a regression or classification model such that the resulting model is best according to some criterion. Here we consider the use of population-based incremental learning (PBIL) to select the variables for a linear regression model to predict a quantitative trait in living organisms. The data here is simulated to represent a genome-wide association study (GWAS) using single nucleotide polymorphisms (SNPs) as explanatory variables and height as an example trait. PBIL was effective in optimizing a variety of model fitness criteria. The resulting models were found to have true positive and false negative rates comparable to those of competing methods.
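As a rough illustration of the idea, PBIL for variable selection can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes BIC of an ordinary least-squares fit as the fitness criterion (the abstract mentions several criteria without naming them), omits the mutation step found in some PBIL variants, and uses hypothetical names (`bic_fitness`, `pbil_select`) and illustrative settings for population size, learning rate, and probability bounds.

```python
import numpy as np


def bic_fitness(X, y, mask):
    """BIC of a least-squares fit on the selected columns (lower is better).

    Assumption: BIC is used here only as one example of a fitness criterion.
    """
    n = len(y)
    # Design matrix: intercept column plus the selected SNP columns.
    A = np.column_stack([np.ones(n), X[:, mask]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))
    k = A.shape[1]  # number of fitted coefficients
    # Small constant guards against log(0) for a perfect fit.
    return n * np.log(rss / n + 1e-12) + k * np.log(n)


def pbil_select(X, y, pop_size=50, generations=100, lr=0.1, seed=0):
    """Simplified PBIL: evolve a vector of inclusion probabilities.

    All default settings are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    p = np.full(n_vars, 0.5)  # start with no preference for any variable
    best_mask, best_score = None, np.inf

    for _ in range(generations):
        # Sample a population of candidate subsets from the probability vector.
        pop = rng.random((pop_size, n_vars)) < p
        scores = np.array([bic_fitness(X, y, m) for m in pop])

        # Track the best subset seen so far.
        i = int(np.argmin(scores))
        if scores[i] < best_score:
            best_score, best_mask = scores[i], pop[i].copy()

        # Shift the probabilities toward this generation's best individual.
        p = (1 - lr) * p + lr * pop[i]
        # Keep probabilities away from 0 and 1 so sampling stays diverse.
        p = np.clip(p, 0.05, 0.95)

    return best_mask, best_score, p
```

Here `X` would be an individuals-by-SNPs genotype matrix (for example coded 0, 1, 2) and `y` the trait values; the returned boolean mask marks the SNPs retained in the best model found.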
Published in: 2012 IEEE Congress on Evolutionary Computation (CEC)
Date of Conference: 10-15 June 2012