Protein sequence motifs are attracting increasing attention in the field of sequence analysis. These recurring regions have the potential to determine a protein's conformation, function, and activities. In our previous work, we sought to obtain protein sequence motifs that are universally conserved across protein family boundaries. Therefore, unlike most popular motif discovery algorithms, our approach takes an extremely large input dataset. To handle such large inputs, we proposed two granular computing models (the FIK and FGK models) to efficiently generate protein motif information, and the Super GSVM-FE model to perform feature elimination and thereby improve the quality of the motif information. In this article, we further improve our SVM feature elimination model to achieve three goals: reduce execution time by half, further improve the quality of the motif information, and add the ability to adjust the number of filtered segments. Compared with the latest results, our new approach shows great improvements.
Published in:
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Date of Conference: 14-17 Oct. 2007