Eukaryotic secretory proteins that traverse the classical ER-Golgi pathway are usually characterized by short N-terminal signal peptides. However, several secretory proteins that lack signal peptides have been found to be exported through a non-classical secretion pathway. A computational approach that predicts non-classical secretory proteins without relying on N-terminal signal peptides is therefore needed. Several prediction methods using various types of features have been proposed to predict secretory proteins, but their prediction performance remains unsatisfactory. This study proposes an SVM-based prediction method, ProSec-iGOX, which uses a major set of informative Gene Ontology (GO) terms and a minor set of assistance features. Physicochemical properties serve as the assistance features and are useful when a query protein sequence has no homologous protein with annotated GO terms. Two data sets, S25 and S40, with sequence identities of 25% and 40%, respectively, are adopted for performance comparisons. ProSec-iGOX yields test accuracies of 95.1% and 96.8% on the data sets S25 and S40, respectively. The latter accuracy (96.8%) is significantly higher than that of SPRED (82.2%), which uses the frequencies of tri-peptides and short peptides, secondary structure, and physicochemical properties as input features to a random forest classifier. The experimental results show that GO terms are effective features for predicting non-classical secretory proteins.
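The feature design described above, a major block of GO-term features plus a minor block of physicochemical features, can be illustrated with a minimal sketch. Everything specific in it is an assumption for illustration only: the abstract does not list the selected GO terms or the physicochemical properties used, so `GO_VOCAB` is a hypothetical three-term vocabulary, the Kyte-Doolittle hydropathy scale stands in for the unspecified properties, and `build_features` is a hypothetical helper, not the ProSec-iGOX implementation. The resulting vector would then be passed to an SVM classifier, which is omitted here.

```python
# Hypothetical GO vocabulary (illustrative only; the paper's informative
# GO terms are not given in the abstract).
GO_VOCAB = ["GO:0005576", "GO:0005615", "GO:0005737"]

# Kyte-Doolittle hydropathy scale, used here as an assumed stand-in for the
# paper's unspecified physicochemical properties.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy over the standard residues in the sequence."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    return sum(values) / len(values) if values else 0.0


def build_features(sequence, go_terms):
    """Concatenate binary GO indicators with one physicochemical feature.

    If no homolog with annotated GO terms is available, go_terms is empty,
    the GO block is all zeros, and only the physicochemical block carries
    information about the query.
    """
    go_block = [1.0 if term in go_terms else 0.0 for term in GO_VOCAB]
    return go_block + [mean_hydropathy(sequence)]


if __name__ == "__main__":
    # Query with a GO annotation inherited from a homolog.
    print(build_features("MKALLV", {"GO:0005615"}))
    # Query without any GO annotation: only the assistance feature is non-zero.
    print(build_features("MKALLV", set()))
```

Concatenating the two blocks keeps one fixed-length input for the classifier whether or not GO annotations are available for the query.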
Computer Science and Automation Engineering (CSAE), 2011 IEEE International Conference on (Volume: 4)
Date of Conference: 10-12 June 2011