Protein sequence comparison is the most powerful tool for inferring the structure and function of novel proteins. This type of inference is commonly based on the similar sequence-similar structure-similar function paradigm and is derived by searching databases of protein sequences for sequence similarity. As entire genomes are being determined at a rapid rate, computational methods for comparing protein sequences will become increasingly essential for probing the complexity of molecular machines. In this paper we introduce a pattern-comparison algorithm, based on the mathematical concepts of linear predictive coding (LPC) and the LPC cepstral distortion measure, for computing similarities and dissimilarities between protein sequences. Experimental results on a real data set of functionally related and functionally unrelated protein sequences show the effectiveness of the proposed approach in terms of both accuracy and computational efficiency.
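To make the idea of an LPC cepstral distortion between two sequences concrete, the following is a minimal Python sketch of the general technique, not the paper's implementation. It rests on several assumptions that the abstract does not state: residues are mapped to numbers with the Kyte-Doolittle hydropathy scale (a stand-in encoding), the LPC coefficients come from the autocorrelation method with the Levinson-Durbin recursion, and the distortion is the squared Euclidean distance between truncated LPC cepstra. The function names, the model order of 10, the number of cepstral coefficients, and the toy sequences are all hypothetical choices made for illustration.

```python
import numpy as np

# ASSUMPTION: Kyte-Doolittle hydropathy values as the numeric encoding.
# The abstract does not say how residues are turned into numbers.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def seq_to_signal(seq):
    """Map a protein sequence (one-letter codes) to a numeric signal."""
    return np.array([HYDROPATHY[ch] for ch in seq.upper()], dtype=float)


def lpc(signal, order):
    """LPC predictor coefficients a[1..p] (x[n] ~ sum_k a[k] x[n-k]),
    estimated by the autocorrelation method and Levinson-Durbin."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the mean before modelling
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. constant) signal
            break
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err  # reflection coeff.
        new_a = a.copy()
        new_a[i] = k
        new_a[1:i] = a[1:i] - k * a[i - 1:0:-1]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpc_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model
    1 / (1 - sum_k a[k] z^-k), via the standard recursion."""
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]


def cepstral_distortion(seq_a, seq_b, order=10, n_ceps=15):
    """Squared Euclidean distance between truncated LPC cepstra."""
    ca = lpc_cepstrum(lpc(seq_to_signal(seq_a), order), n_ceps)
    cb = lpc_cepstrum(lpc(seq_to_signal(seq_b), order), n_ceps)
    return float(np.sum((ca - cb) ** 2))


if __name__ == "__main__":
    # Arbitrary toy strings, not data from the paper.
    s1 = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    s2 = "MKLAYVAKQRQLSFIKSHFSRQIEDRLGLVEVQ"
    print(cepstral_distortion(s1, s2))
```

One property of this kind of measure is that each sequence is reduced to a short, fixed-length coefficient vector, so two sequences of different lengths can be compared without computing an alignment. Sequences must be longer than the chosen model order for the autocorrelation estimates to be meaningful.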