Protein sequence motif information is crucial to the analysis of biologically significant regions. These conserved regions have the potential to determine the role of the proteins. Many algorithms and techniques for discovering motifs require a fixed window size to be defined in advance. Because of the fixed size, these approaches often deliver a number of similar motifs that are simply shifted by a few positions or that include mismatches. To confront the mismatched-motif problem, we use the super-rule concept to construct a Super-Rule-Tree (SRT) by means of a modified HHK clustering, which requires no parameter setup, to identify the similarities and dissimilarities between the motifs. Analysis of the motifs generated by our approach shows that they are significant not only at the sequence level but also in secondary structure similarity. We believe the newly proposed HHK clustering algorithm and the SRT can play an important role in related research that relies on a predefined fixed window size.
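To make the shifted-motif problem concrete, the following is a minimal illustrative sketch, not the method described here. It only shows how a fixed window produces neighbouring segments that are copies of each other offset by a few positions. The function names, the `max_shift` parameter, and the example sequence are hypothetical and chosen for illustration.

```python
def sliding_windows(sequence, window_size):
    """Return every contiguous segment of length window_size."""
    return [sequence[i:i + window_size]
            for i in range(len(sequence) - window_size + 1)]


def is_shifted_copy(a, b, max_shift=2):
    """Return True if equal-length segments a and b overlap exactly
    after shifting one of them by 1..max_shift positions."""
    if len(a) != len(b):
        return False
    # Keep at least one overlapping character when comparing.
    limit = min(max_shift, len(a) - 1)
    for s in range(1, limit + 1):
        if a[s:] == b[:-s] or b[s:] == a[:-s]:
            return True
    return False


# Made-up amino-acid string, used only for illustration.
sequence = "MKVLAAGIVGLLLA"
segments = sliding_windows(sequence, 9)

# Adjacent fixed-size windows are shifted copies of one another.
redundant_pairs = [
    (a, b)
    for i, a in enumerate(segments)
    for b in segments[i + 1:]
    if is_shifted_copy(a, b)
]
```

A grouping step such as the SRT would aim to place segments like these under a common parent rather than report them as separate motifs. The exact-overlap test above ignores mismatches, which a real similarity measure would need to tolerate.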