In biology and the medical sciences, highly parallel biological assays spurred a revolution that led to the emergence of the '-omics' era. Dimensionality reduction techniques are necessary to analyze, interpret, validate, and take advantage of the wealth of high-dimensional data these assays provide. This paper builds on a DNA microarray study that provided gene signatures for hypoxia. These gene signatures were tested on a large breast cancer data set to assess their prognostic power by means of Kaplan-Meier survival, univariate, and multivariate analyses. We explore several mathematical programming-based techniques that aim to reduce the size of the gene signatures as much as possible while maintaining the key characteristics of the original signature, namely its prognostic and diagnostic significance. The proposed signature reduction techniques have practical potential uses: by downsizing the relevant data to a manageable size, one can patent the core set of biomarkers and create a dedicated assay (e.g., on a customized array) for routine applications (e.g., in the clinical setting), leading to individualized medicine capabilities. Our experiments show that the reduced hypoxia signatures yield results that are qualitatively and quantitatively similar to those of the original signatures.
Published in:
Sixth International Conference on Machine Learning and Applications (ICMLA 2007)
Date of Conference: 13-15 Dec. 2007