Parallel monitoring of the expression profiles of thousands of genes is particularly promising for gaining a deeper understanding of cancer biology and for identifying molecular signatures that support the histological classification of neoplastic specimens. A computational procedure for feature extraction and classification of gene expression data, based on principal component analysis and the soft independent modeling of class analogy (SIMCA) approach, is described. The identified features contain critical information about the gene-phenotype relationships observed during changes in cell physiology. They provide a rational, dimensionally reduced basis for understanding the biology of disease onset, defining targets for therapeutic intervention, and developing diagnostic tools for the identification and classification of pathological states. The proposed method has been tested on the childhood round blue cell tumors study presented by Khan et al. Analysis of the SIMCA model allows specific phenotype markers to be identified and previously unseen instances to be assigned to multiple classes.
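The general idea behind SIMCA can be illustrated with a minimal sketch: one principal component model is fitted per class, and a new sample is accepted by every class whose model reconstructs it with a sufficiently small residual. The code below is not the authors' implementation. The function names, the layout of the data (rows are samples, columns are genes), the number of components, and the fixed `cutoff` (a stand-in for the critical value of the F-test usually applied in SIMCA) are all assumptions made for illustration.

```python
import numpy as np


def fit_class_model(X, n_components):
    """Fit a PCA model to the samples (rows of X) of a single class.

    Requires more than n_components + 1 samples so that the residual
    degrees of freedom are positive.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_genes, n_components)
    residuals = Xc - Xc @ P @ P.T
    n, p = X.shape
    # Pooled residual variance of the class after removing the components.
    dof = (n - n_components - 1) * (p - n_components)
    s0_sq = (residuals ** 2).sum() / dof
    return {"mean": mean, "P": P, "s0_sq": s0_sq}


def residual_ratio(x, model):
    """Residual variance of sample x relative to the class residual variance."""
    xc = x - model["mean"]
    e = xc - model["P"] @ (model["P"].T @ xc)
    p, a = model["P"].shape
    s_sq = (e ** 2).sum() / (p - a)
    return s_sq / model["s0_sq"]


def simca_assign(x, models, cutoff):
    """Return every class label whose model accepts x.

    `cutoff` is a hypothetical fixed threshold on the variance ratio; the
    result may contain zero, one, or several labels.
    """
    return [label for label, m in models.items()
            if residual_ratio(x, m) <= cutoff]
```

Because each class is modeled independently, `simca_assign` returns a list rather than a single label: a sample may fit several class models or none, which corresponds to the multiple-class assignment of unseen instances mentioned above. The loadings in `P` indicate which genes drive each class model, which is one plausible route to candidate phenotype markers under these assumptions.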