Discovering the genetic basis of common human diseases will be assisted by large-scale association studies with a large number of individuals and genetic markers, such as single-nucleotide polymorphisms (SNPs). The potential size of the data and the resulting model space require the development of efficient methodology to unravel associations between epidemiological outcomes and SNPs in dense genetic maps. We apply an evolutionary algorithm (EA) to construct models consisting of logic trees. These trees are Boolean expressions involving nodes that contain strings of SNPs in high linkage disequilibrium (LD), that is, SNPs that are highly correlated with each other. At each generation of the algorithm, a population of logic tree models is modified using selection, crossover, and mutation moves. Logic trees are selected for the next generation using a fitness function based on the marginal likelihood in a Bayesian regression framework. Mutation and crossover moves use LD measures to propose changes to the trees, and facilitate the movement through the model space. We demonstrate our method on data from a candidate gene study of quantitative genetic variation.
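To make the search procedure concrete, the following is a minimal sketch of one generation of such an evolutionary algorithm; it is not the authors' implementation. It assumes binary-coded SNPs, uses squared correlation as the LD measure, and substitutes a negative BIC for the Bayesian marginal likelihood as a rough proxy. Crossover is omitted for brevity, and all function names (`evaluate`, `r_squared`, `fitness`, `mutate_leaf`, `next_generation`) and the 0.8 LD threshold are hypothetical.

```python
import math
import random

# A logic tree is a nested tuple (illustrative representation, an assumption):
#   ("snp", j)            leaf: binary SNP in column j
#   ("not", subtree)      negation
#   ("and" | "or", l, r)  binary Boolean node

def evaluate(tree, row):
    """Evaluate a logic tree on one individual's binary SNP vector."""
    op = tree[0]
    if op == "snp":
        return bool(row[tree[1]])
    if op == "not":
        return not evaluate(tree[1], row)
    left, right = evaluate(tree[1], row), evaluate(tree[2], row)
    return (left and right) if op == "and" else (left or right)

def r_squared(X, i, j):
    """Squared correlation between binary SNP columns i and j (one LD measure)."""
    n = len(X)
    pi = sum(r[i] for r in X) / n
    pj = sum(r[j] for r in X) / n
    pij = sum(r[i] * r[j] for r in X) / n
    denom = pi * (1 - pi) * pj * (1 - pj)
    return 0.0 if denom == 0 else (pij - pi * pj) ** 2 / denom

def fitness(tree, X, y):
    """Stand-in fitness: negative BIC of regressing y on the tree's 0/1 output.
    Assumption: BIC only approximates a log marginal likelihood."""
    n = len(y)
    groups = {False: [], True: []}
    for row, yi in zip(X, y):
        groups[evaluate(tree, row)].append(yi)
    # With a single binary predictor, least-squares fits are the group means.
    rss = 0.0
    for vals in groups.values():
        if vals:
            mean = sum(vals) / len(vals)
            rss += sum((v - mean) ** 2 for v in vals)
    rss = max(rss, 1e-12)  # guard against log(0) on a perfect fit
    k = 2  # intercept and slope
    return -(n * math.log(rss / n) + k * math.log(n))

def mutate_leaf(tree, X, rng, threshold=0.8):
    """Swap one leaf SNP for another SNP in high LD with it, if one exists."""
    op = tree[0]
    if op == "snp":
        j = tree[1]
        partners = [k for k in range(len(X[0]))
                    if k != j and r_squared(X, j, k) >= threshold]
        return ("snp", rng.choice(partners)) if partners else tree
    if op == "not":
        return ("not", mutate_leaf(tree[1], X, rng, threshold))
    # Binary node: descend into one randomly chosen child.
    if rng.random() < 0.5:
        return (op, mutate_leaf(tree[1], X, rng, threshold), tree[2])
    return (op, tree[1], mutate_leaf(tree[2], X, rng, threshold))

def next_generation(population, X, y, rng):
    """One generation: keep the fitter half, refill with LD-guided mutants."""
    ranked = sorted(population, key=lambda t: fitness(t, X, y), reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]
    n_children = len(population) - len(survivors)
    children = [mutate_leaf(rng.choice(survivors), X, rng)
                for _ in range(n_children)]
    return survivors + children

# Toy data: SNPs 0 and 1 are in perfect LD; the trait tracks SNP 0.
X = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
y = [2.0, 2.1, 0.0, 0.1, 1.9, 0.2]
population = [("snp", 2), ("snp", 0)]
population = next_generation(population, X, y, random.Random(0))
```

In this toy run the tree on SNP 0 survives selection, and its mutant moves to SNP 1, the only marker in high LD with it. Restricting proposals to correlated markers in this way is one simple reading of how LD measures can guide movement through the model space.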