Prediction of Protein Folds: Extraction of New Features, Dimensionality Reduction, and Fusion of Heterogeneous Classifiers

2 Author(s)
Ghanty, P. ; Praxis Softek Solutions Pvt. Ltd., Kolkata ; Pal, N.R.

Here, we consider a two-level protein fold determination problem, with four structural classes at level 1 and 27 folds at level 2. We propose several new features and also use existing ones, including frequencies of adjacent residues, frequencies of residues separated by one residue, and triplets (trios) of amino acid compositions (AACs). The dimensionality of the trio AAC features is drastically reduced using a novel neural-network-based online feature selection scheme. We also propose new sets of features, called trio potential, computed from hydrophobicity values for the selected trio AACs only. We demonstrate that the proposed features, including the selected trio AACs and trio potential, have good discriminating power for protein fold determination. As machine learning tools, we use a multilayer perceptron network, a radial basis function network, and a support vector machine. To further improve recognition accuracy, we fuse different classifiers trained on the same set of features as well as on different sets of features. The effectiveness of our schemes is demonstrated on a benchmark Structural Classification of Proteins (SCOP) dataset. Our system achieves 84.9% test accuracy for SCOP structural class (four classes) determination and 68.6% test accuracy for fold recognition with 27 folds. To demonstrate the consistency of the feature sets and fusion schemes, we also perform fivefold cross-validation experiments.
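
The sequence-derived features named in the abstract (adjacent-residue frequencies, frequencies of residues separated by one residue, and trio AACs) could be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `pair_frequencies` and `trio_frequencies` are hypothetical, a "trio" is assumed to be a contiguous three-residue window, counts are assumed to be normalized by the number of valid windows, and windows containing non-standard residue codes are skipped. The abstract does not specify any of these details.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pair_frequencies(sequence, gap=0):
    """Relative frequencies of ordered residue pairs (400 features).

    gap=0 pairs adjacent residues; gap=1 pairs residues separated
    by one residue. Normalization by the number of valid pairs is an
    assumption, not something stated in the abstract.
    """
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(keys, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    return {k: (c / total if total else 0.0) for k, c in counts.items()}


def trio_frequencies(sequence):
    """Relative frequencies of contiguous residue triplets (8000 features).

    Assumes a "trio" is a contiguous three-residue window.
    """
    keys = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    for i in range(len(sequence) - 2):
        trio = sequence[i:i + 3]
        if trio in counts:  # skip trios with non-standard residues
            counts[trio] += 1
            total += 1
    return {k: (c / total if total else 0.0) for k, c in counts.items()}
```

Under these assumptions the trio representation alone has 20^3 = 8000 dimensions, compared with 400 for each pair representation, which illustrates why the abstract stresses reducing the dimensionality of the trio AAC features before classification.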

Published in:

NanoBioscience, IEEE Transactions on (Volume: 8, Issue: 1)