This study presents a methodology for analyzing data produced by Genome-Wide Association Studies (GWAS), using appropriate techniques and software packages. The analytical process is applied to Rheumatoid Arthritis (RA) and Multiple Sclerosis (MS) datasets produced by experimental microarray assays. The process filters the datasets in order to identify statistically significant genetic variants associated with each disease. The analysis is then expanded using the Multifactor Dimensionality Reduction (MDR) data mining strategy. MDR reduces the dimensionality of the genotyped predictive factors from n dimensions to one by classifying the genotypes at the different loci as either high-risk or low-risk. In this way, risk prediction models of disease susceptibility can be built, and the methodology is further extended by adding clinical and environmental factors to the RA and MS genomic loci data. The methodology aims to transform the genomic datasets produced by microarray experiments into a knowledge modeling framework consisting of prognostic/diagnostic decision rules and risk predictive factors. Finally, the methodology for evaluating the MDR data mining strategy is presented.
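To make the dimensionality-reduction step concrete, the following is a minimal Python sketch of the core MDR idea, not the implementation used in the study. It assumes genotypes coded as small integers, a binary case/control status (1 = case, 0 = control), and a high-risk threshold equal to the overall case-to-control ratio. The function names (`mdr_risk_table`, `mdr_transform`, `balanced_accuracy`, `best_locus_pair`) and the toy data are hypothetical and chosen for illustration only; scoring by balanced accuracy on the same data is a simplifying assumption, and this sketch does not reproduce the study's evaluation methodology.

```python
from collections import defaultdict
from itertools import combinations


def mdr_risk_table(genotypes, status, loci, threshold=None):
    """Label each genotype combination at `loci` as high-risk (1) or low-risk (0)."""
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both cases and controls are required")
    if threshold is None:
        # Assumption: use the overall case/control ratio as the cut-off.
        threshold = n_cases / n_controls
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(genotypes, status):
        cell = tuple(row[i] for i in loci)
        counts[cell][y] += 1
    table = {}
    for cell, (controls, cases) in counts.items():
        if controls == 0:
            table[cell] = 1  # only cases observed in this cell
        else:
            table[cell] = 1 if cases / controls >= threshold else 0
    return table


def mdr_transform(genotypes, loci, table, default=0):
    """Collapse the selected loci into a single high-risk/low-risk attribute."""
    return [table.get(tuple(row[i] for i in loci), default) for row in genotypes]


def balanced_accuracy(predicted, status):
    """Mean of sensitivity and specificity."""
    tp = sum(1 for p, y in zip(predicted, status) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(predicted, status) if p == 0 and y == 0)
    n_cases = sum(status)
    n_controls = len(status) - n_cases
    return 0.5 * (tp / n_cases + tn / n_controls)


def best_locus_pair(genotypes, status):
    """Exhaustively score every pair of loci and return (pair, accuracy)."""
    best = None
    for loci in combinations(range(len(genotypes[0])), 2):
        table = mdr_risk_table(genotypes, status, loci)
        acc = balanced_accuracy(mdr_transform(genotypes, loci, table), status)
        if best is None or acc > best[1]:
            best = (loci, acc)
    return best


# Synthetic toy data (not from the RA or MS datasets): three loci, where
# status depends only on whether the genotypes at loci 0 and 1 differ.
genotypes = [
    [0, 0, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0],
    [0, 0, 0], [0, 1, 1], [1, 0, 0], [1, 1, 1],
]
status = [0, 1, 1, 0, 0, 1, 1, 0]

print(best_locus_pair(genotypes, status))  # prints ((0, 1), 1.0)
```

In this toy example neither locus 0 nor locus 1 separates cases from controls on its own, but their combination does, which is the kind of interaction the high-risk/low-risk pooling is intended to expose. In practice the accuracy would be estimated on held-out data rather than on the data used to build the risk table.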