The classification of protein sequences into families is an important tool for assigning structural and functional annotations to newly discovered proteins. We present a classification system that uses pattern recognition techniques to create a numerical vector representation of a protein sequence and then classifies the sequence into one of a number of given families. We introduce the use of fuzzy ARTMAP classifiers and show that, coupled with genetic-algorithm-based feature subset selection, the system is able to classify protein sequences with an accuracy of 93%. This accuracy is compared with that of numerous other classification tools, and the comparison demonstrates that the fuzzy ARTMAP is suitable because of its high accuracy, quick training times and capacity for incremental learning.
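As a rough illustration of what a numerical vector representation of a protein sequence might look like, the sketch below computes amino-acid composition (the fraction of each of the 20 standard residues) and then applies complement coding, the input normalisation commonly used with fuzzy ART-family networks. The choice of composition features and the function names `composition_vector` and `complement_code` are assumptions made for this example; the abstract does not state which features the system actually extracts.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical feature choice for illustration only; residues
    outside the 20 standard letters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def complement_code(features):
    """Append 1 - x for every feature x (features must lie in [0, 1]).

    Complement coding doubles the vector length and keeps the sum of
    its components constant, which fuzzy ART-style networks rely on.
    """
    return list(features) + [1.0 - x for x in features]


# Example with an arbitrary, made-up sequence.
vec = composition_vector("MKVLAAGIVALLLA")
coded = complement_code(vec)
```

A vector such as `coded` could then be passed to a classifier, and a feature subset selection step would choose which of its components to keep.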