Toward Better Understanding of Protein Secondary Structure: Extracting Prediction Rules

3 Author(s)
Nguyen, M.N. (Bioinf. Inst., Singapore, Singapore); Zurada, J.M.; Rajapakse, J.C.

Although numerous computational techniques have been applied to predict protein secondary structure (PSS), only limited studies have dealt with discovery of logic rules underlying the prediction itself. Such rules offer interesting links between the prediction model and the underlying biology. In addition, they enhance interpretability of PSS prediction by providing a degree of transparency to the predicting model usually regarded as a black box. In this paper, we explore the generation and use of C4.5 decision trees to extract relevant rules from PSS predictions modeled with two-stage support vector machines (TS-SVM). The proposed rules were derived on the RS126 data set of 126 nonhomologous globular proteins and on the PSIPRED data set of 1,923 protein sequences. Our approach has produced sets of comprehensible, and often interpretable, rules underlying the PSS predictions. Moreover, many of the rules seem to be strongly supported by biological evidence. Further, our approach resulted in good prediction accuracy, few and usually compact rules, and rules that are generally of higher confidence levels than those generated by other rule extraction techniques.
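
The general idea in the abstract, fitting a decision tree to the outputs of a black-box predictor and reading rules off its branches, can be illustrated with a minimal sketch. This is not the authors' implementation: the `black_box` function is a hypothetical toy stand-in for the TS-SVM, the sequence is invented, and the tree uses only the gain-ratio split criterion associated with C4.5, without pruning or continuous attributes. All function names are assumptions made for illustration.

```python
import math
from collections import Counter

def black_box(window):
    # Hypothetical stand-in for a trained predictor (NOT the paper's TS-SVM):
    # labels the centre residue of a 3-residue window as helix/strand/coil.
    centre = window[1]
    if centre in "AELM":
        return "H"
    if centre in "VIY":
        return "E"
    return "C"

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    # Gain ratio for a categorical attribute (a position in the window).
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0

def extract_rules(rows, labels, attrs, conditions=()):
    # Grow an unpruned tree; each leaf becomes (conditions, class, confidence).
    majority, count = Counter(labels).most_common(1)[0]
    if count == len(labels) or not attrs:
        return [(conditions, majority, count / len(labels))]
    best = max(attrs, key=lambda a: gain_ratio(rows, labels, a))
    if gain_ratio(rows, labels, best) <= 0:
        return [(conditions, majority, count / len(labels))]
    remaining = [a for a in attrs if a != best]
    rules = []
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        rules += extract_rules(list(sub_rows), list(sub_labels), remaining,
                               conditions + ((best, value),))
    return rules

def classify(rules, window):
    # Return the class of the first rule whose conditions all match.
    for conditions, label, _ in rules:
        if all(window[pos] == residue for pos, residue in conditions):
            return label
    return None

# Invented toy sequence; windows are (previous, centre, next) residues.
sequence = "MAELKVIYGSAELLVIYPG"
windows = [tuple(sequence[i - 1:i + 2]) for i in range(1, len(sequence) - 1)]
predictions = [black_box(w) for w in windows]  # labels come from the model, not ground truth
rules = extract_rules(windows, predictions, attrs=[0, 1, 2])

for conditions, label, confidence in rules:
    text = " AND ".join(f"residue[{pos}] == {res!r}" for pos, res in conditions)
    print(f"IF {text} THEN {label} (confidence {confidence:.2f})")
```

The tree is trained on the model's predictions rather than on the true labels, so the rules describe what the predictor does; how well they match the underlying biology is a separate question.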

Published in: IEEE/ACM Transactions on Computational Biology and Bioinformatics (Volume: 8, Issue: 3)