Inferring protein functions from different data sources is a challenging task in the post-genomic era, as a large number of crude protein structures from structural genomics projects are now being solved without their biochemical functions being characterized. Recently, many different methods have been used to predict protein functions, including those based on Protein-Protein Interaction (PPI) data, structure, sequence relationships, and gene expression data. Among these approaches, methods based on protein interaction data are very promising. In this paper, we study a network-based method using locally linear embedding (LLE). LLE is a robust learning algorithm that performs dimensionality reduction by computing a neighborhood-preserving embedding of high-dimensional data. We first embed both annotated and unannotated proteins in a low-dimensional Euclidean space; then, we apply semi-supervised learning techniques to classify unannotated proteins into different functional groups. Finally, we predict functions for yeast proteins whose functions are unknown. 5-fold cross-validation on the GO terms is used to compare the performance of different approaches, and the proposed method performs significantly better than the others.
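The pipeline described above (embed all proteins with LLE, classify the unannotated ones from the annotated ones, then score with 5-fold cross-validation) can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The toy feature vectors stand in for a network-derived protein representation, which the abstract does not specify. A plain k-nearest-neighbor vote is swapped in for the unspecified semi-supervised technique. The function names, the parameter values, and the use of `-1` to mark unannotated proteins are all assumptions made for illustration.

```python
import numpy as np

def lle_embed(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Standard LLE: rows of X are points; returns low-dimensional coordinates."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each point as an affine combination of its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                      # neighbors centered on point i
        C = Z @ Z.T                                # local Gram matrix
        C += reg * (np.trace(C) + 1e-12) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()                # weights sum to 1

    # Embedding: bottom eigenvectors of (I - W)^T (I - W), skipping the first.
    I_W = np.eye(n) - W
    _, vecs = np.linalg.eigh(I_W.T @ I_W)
    return vecs[:, 1:n_components + 1]

def knn_predict(Y, labels, n_neighbors=3):
    """Fill labels marked -1 by majority vote of the nearest annotated points."""
    annotated = np.flatnonzero(labels >= 0)
    out = labels.copy()
    for i in np.flatnonzero(labels < 0):
        dist = np.linalg.norm(Y[annotated] - Y[i], axis=1)
        nearest = annotated[np.argsort(dist)[:n_neighbors]]
        out[i] = np.bincount(labels[nearest]).argmax()
    return out

def five_fold_accuracy(Y, labels, n_folds=5, seed=0):
    """Hide one fold of annotations at a time and score the predictions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(np.flatnonzero(labels >= 0))
    scores = []
    for fold in np.array_split(idx, n_folds):
        masked = labels.copy()
        masked[fold] = -1
        pred = knn_predict(Y, masked)
        scores.append(np.mean(pred[fold] == labels[fold]))
    return float(np.mean(scores))

# Toy data (hypothetical): 40 "proteins" along a noisy 3-D curve, two classes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.cos(3 * t), np.sin(3 * t), t])
X += 0.01 * rng.standard_normal((40, 3))
labels = (t >= 0.5).astype(int)
labels[::4] = -1                     # every fourth protein is unannotated

Y = lle_embed(X)                     # embed annotated and unannotated together
pred = knn_predict(Y, labels)        # assign classes to unannotated proteins
acc = five_fold_accuracy(Y, labels)  # cross-validated accuracy on annotated ones
print(f"5-fold accuracy on toy data: {acc:.2f}")
```

The embedding is computed once over all points, annotated or not, which mirrors the transductive setup the abstract describes. Only the classification step sees the labels, so hiding a fold for cross-validation does not require re-embedding.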