Membrane proteins are fundamental components of the cell and play essential roles in nearly all cellular processes. Predicting membrane protein types through biological experiments is often complicated and time consuming. It is therefore highly desirable to develop a robust, reliable and high-throughput in silico method for predicting membrane protein types. In this study, the authors use two feature extraction strategies, dipeptide composition and pseudo amino acid (PseAA) composition, to classify membrane protein types. In addition, a composite model is developed by concatenating the dipeptide and PseAA composition-based features. Two feature selection methods, neighbourhood preserving embedding and locally linear embedding (LLE), are then applied to reduce the dimensionality of the composite model. The performance of these feature extraction strategies is evaluated using four classifiers: K-nearest neighbour, probabilistic neural network (PNN), support vector machine (SVM) and grey incidence degree. The highest success rates are observed with the LLE-based reduced features. SVM yields the best accuracy of 88.2% in the jackknife test, whereas PNN obtains the highest accuracy of 98.4% in the independent dataset test. Performance measures other than accuracy are also used, namely the Matthews correlation coefficient, sensitivity and precision. The authors' simulated results show that the composite model significantly discriminates between membrane protein types and might be useful for future research and drug discovery.
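As an illustration of the dipeptide composition feature named above, the following is a minimal sketch, not the authors' implementation. It assumes the commonly used definition: the normalised frequency of each of the 400 ordered pairs of the 20 standard amino acids. The names `AMINO_ACIDS`, `DIPEPTIDES` and `dipeptide_composition` are hypothetical, and skipping pairs that contain non-standard residues is an assumption of this sketch.

```python
from itertools import product

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered pairs, in a fixed order that defines the feature layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of normalised dipeptide frequencies.

    Overlapping adjacent pairs are counted. Pairs containing a residue
    outside the standard alphabet are skipped (an assumption of this sketch).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short, or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features))
    print(features[DIPEPTIDES.index("AC")], features[DIPEPTIDES.index("CA")])
```

Under these assumptions, a composite vector of the kind described in the abstract would be obtained by concatenating this list with a PseAA composition vector before dimensionality reduction.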