Traditional signal processing methods cannot detect interspersed repeats and generally cannot handle nonstationary signals. In this paper, we propose a new method for periodicity detection in protein sequences to locate interspersed repeats. We first apply an autoregressive model with a sliding window to find candidate repeating subsequences within a protein sequence. Then, we use an iterative hidden Markov model (HMM) to count the number of subsequences similar to each candidate. This iterative HMM search over the candidate subsequences helps identify interspersed repeats. Finally, the counts of repeating subsequences are aggregated into a feature and used in the classification process. Experimental results show that our method substantially improves the performance of solenoid protein recognition.