Right-handed single-stranded beta-helix proteins characterized as virulence factors, allergens and toxins are a threat to human health. Identifying these proteins from primary sequence is of great importance in biomedicine and medical microbiology. In this paper, a support vector machine (SVM) is used to predict the presence of the beta-helix fold in protein sequences from their dipeptide composition. A 400-dimensional input vector is used to search for the conserved secondary-structure elements, called rungs, found in beta-helix proteins. A maximum accuracy of 90.1% and a Matthews correlation coefficient of 0.77 are obtained in a 5-fold cross-validation procedure. In addition, a position-specific scoring matrix (PSSM) is used to score the putative rung sequences identified by the SVM. Finally, the predicted beta-helix proteins are threaded against a custom beta-helix template library to achieve high prediction confidence. The method recognizes right-handed beta-helices with 100% sensitivity and 99.8% specificity on a test set of known protein structures.
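To illustrate the 400-dimensional encoding mentioned above, the following is a minimal sketch of how a dipeptide composition vector and the Matthews correlation coefficient could be computed. It is not the authors' implementation. The function names, the alphabetical ordering of the 20 standard amino acids, the normalization to fractions (rather than percentages), and the decision to skip dipeptides containing non-standard residues are all assumptions made for illustration.

```python
import math
from itertools import product

# The 20 standard amino acids (one-letter codes); this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20 x 20 = 400 ordered residue pairs, mapped to a fixed vector position.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide fractions for a protein sequence.

    Overlapping dipeptides are counted; pairs containing a character outside
    the 20 standard residues are skipped (an illustrative choice).
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = DIPEPTIDE_INDEX.get(seq[i:i + 2])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when any marginal total is zero
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    vec = dipeptide_composition("ACDA")  # hypothetical toy sequence
    print(len(vec), vec[DIPEPTIDE_INDEX["AC"]])
```

A vector of this kind could then be passed as one training example to any SVM implementation; the kernel and hyperparameters used in the paper are not stated in this abstract.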