The amino acid sequence of a protein is widely recognized to carry important cues for predicting its subcellular location. We introduce a new feature vector for predicting, from sequence alone, proteins targeted to various compartments in the intracellular and secretory pathway. The features are based on the global Composition, Transition and Distribution (CTD) of amino acid attributes such as hydrophobicity, normalized van der Waals volume, polarity, polarizability, charge, secondary structure and solvent accessibility. Each sequence is divided into three equal parts, and the features are extracted separately for each part. Using these feature vectors, we trained a Support Vector Machine to classify intracellular and secretory proteins. On an independent dataset, our method achieves an accuracy of 92% for human, 88% for plant and 95% for fungal proteins at the root level of the protein sorting pathway.
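To make the feature construction concrete, the following is a minimal sketch of segment-wise CTD extraction for a single attribute, not the authors' implementation. Several details are assumptions made for illustration: the three-class hydrophobicity grouping, the function names, the way a sequence whose length is not divisible by three is split, and the rule used to pick the 25%, 50% and 75% occurrence positions in the Distribution descriptor.

```python
import math

# Assumed three-class hydrophobicity grouping (illustrative; not given in the text).
HYDROPHOBICITY_GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
LABELS = "123"


def encode(seq, groups=HYDROPHOBICITY_GROUPS):
    """Map each residue to its class label; unknown residues are skipped."""
    lookup = {aa: label for label, members in groups.items() for aa in members}
    return "".join(lookup[aa] for aa in seq.upper() if aa in lookup)


def composition(enc):
    """Fraction of residues in each class (3 values)."""
    n = len(enc)
    return [enc.count(c) / n if n else 0.0 for c in LABELS]


def transition(enc):
    """Fraction of adjacent pairs that switch between two classes (3 values)."""
    n_pairs = len(enc) - 1
    feats = []
    for a, b in (("1", "2"), ("1", "3"), ("2", "3")):
        count = sum(1 for x, y in zip(enc, enc[1:]) if {x, y} == {a, b})
        feats.append(count / n_pairs if n_pairs > 0 else 0.0)
    return feats


def distribution(enc):
    """Relative position of the first, 25%, 50%, 75% and last occurrence
    of each class (15 values). The rounding rule is an assumption."""
    n = len(enc)
    feats = []
    for c in LABELS:
        positions = [i + 1 for i, x in enumerate(enc) if x == c]
        if not positions:
            feats.extend([0.0] * 5)
            continue
        k = len(positions)
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            idx = max(0, math.ceil(frac * k) - 1)
            feats.append(positions[idx] / n)
    return feats


def split_three(seq):
    """Split into three near-equal parts (assumed handling of remainders)."""
    n = len(seq)
    return [seq[i * n // 3:(i + 1) * n // 3] for i in range(3)]


def ctd_features(seq):
    """Concatenate C, T and D descriptors for each of the three parts."""
    feats = []
    for part in split_three(seq):
        enc = encode(part)
        feats += composition(enc) + transition(enc) + distribution(enc)
    return feats
```

With one attribute this yields 3 x (3 + 3 + 15) = 63 values per sequence; repeating the procedure with a grouping for each further attribute and concatenating the results would give the full vector, which could then be passed to any SVM implementation.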