Classification analysis of gene expression data can yield knowledge of gene functions and disease mechanisms. However, the data involve nonlinear interactions among genes and environmental factors. Worse yet, the data are usually high-dimensional while the obtainable sample sizes are relatively small, which gives rise to the well-known curse of dimensionality in the classification task. This work describes how gene expression data can be analyzed using the Locality Preserving Projections (LPP) manifold learning method. LPP is a dimensionality reduction strategy for feature selection and visualization. Using LPP, the high-dimensional gene expression data are mapped to a low-dimensional subspace for data analysis. LPP finds the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. It shares many of the convenient data-representation properties of nonlinear techniques such as Laplacian Eigenmaps and Locally Linear Embedding (LLE), yet it is linear and, more crucially, is defined everywhere in the ambient space rather than only on the training data points. Comparative experiments against PCA, LDA, LLE, and other methods on different gene expression datasets show that the LPP-based method has the potential to be more efficient for classifying complex gene expression data.
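
As a rough illustration of the LPP mapping described above, the following Python sketch builds a k-nearest-neighbour graph with heat-kernel weights, forms the graph Laplacian L = D - W, and solves the generalized eigenproblem X^T L X a = lambda X^T D X a for the smallest eigenvalues. It is a minimal sketch under stated assumptions, not the implementation evaluated in this work: the function name `fit_lpp`, the parameters `pca_dim` and `reg`, the median-based kernel width, and the synthetic data are all hypothetical choices made for illustration. The optional PCA step is included because the matrix X^T D X is singular when there are far more genes than samples.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist


def fit_lpp(X, n_components=2, n_neighbors=5, pca_dim=None, reg=1e-8):
    """Return (mean, P) such that (X_new - mean) @ P embeds any sample.

    X has shape (n_samples, n_features). The names pca_dim and reg are
    illustrative knobs, not parameters taken from the paper.
    Assumes the samples are distinct (no zero neighbour distances).
    """
    mean = X.mean(axis=0)
    Xc = X - mean

    # Optional PCA step: keeps X^T D X well conditioned when features >> samples.
    if pca_dim is None:
        basis = np.eye(X.shape[1])
    else:
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        basis = Vt[:pca_dim].T
    Z = Xc @ basis

    # k-nearest-neighbour graph with heat-kernel weights.
    dist2 = cdist(Z, Z, metric="sqeuclidean")
    np.fill_diagonal(dist2, np.inf)          # a sample is not its own neighbour
    nbrs = np.argsort(dist2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(len(Z)), n_neighbors)
    cols = nbrs.ravel()
    t = np.median(dist2[rows, cols])         # kernel width heuristic (assumption)
    W = np.zeros_like(dist2)
    W[rows, cols] = np.exp(-dist2[rows, cols] / t)
    W = np.maximum(W, W.T)                   # make the graph symmetric

    # Graph Laplacian and the generalized eigenproblem of LPP.
    D = np.diag(W.sum(axis=1))
    L = D - W
    lhs = Z.T @ L @ Z
    rhs = Z.T @ D @ Z + reg * np.eye(Z.shape[1])   # small ridge for stability
    _, vecs = eigh(lhs, rhs)                 # eigenvalues in ascending order

    # Smallest eigenvalues give the locality-preserving directions.
    return mean, basis @ vecs[:, :n_components]


# Synthetic stand-in for an expression matrix: 40 samples, 500 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
mean, P = fit_lpp(X, n_components=2, n_neighbors=5, pca_dim=10)

Y = (X - mean) @ P                           # embedding of the training samples
x_new = rng.normal(size=(1, 500))
y_new = (x_new - mean) @ P                   # an unseen sample maps the same way
```

Because the result is a single linear map, the last two lines show the property highlighted above: a sample outside the training set is embedded by the same matrix product, which nonlinear methods such as Laplacian Eigenmaps or LLE do not provide directly.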