With the exponential growth of genomic sequence data, there is an increasing demand for accurate identification of protein-coding regions in genomic sequences. Despite considerable progress in the computational identification of protein-coding regions in recent years, the performance and efficiency of prediction methods still need to be improved. A novel method for predicting the position of coding regions is proposed. First, a support vector machine is used as a classifier to recognize the first nucleotide of a codon in a coding region. Then, the classifier's output values are analyzed with the short-time Fourier transform, and the difference in their time-frequency characteristics is used to accurately determine the position of coding regions. The algorithm not only predicts coding regions but also identifies the first nucleotide of each codon within them, which is important for accurate translation into a protein sequence. Simulation results show that the proposed method is more effective for coding region prediction than existing coding region discovery tools.
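The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `period3_power`, the Hann window, the window length, and the step size are all assumptions, and the classifier outputs are replaced by synthetic scores. The sketch also assumes that, inside a coding region, high scores recur every three nucleotides, so a short-time Fourier analysis would show elevated power at frequency 1/3 there.

```python
import numpy as np

def period3_power(scores, window=150, step=3):
    """Short-time Fourier power at frequency 1/3 along a score sequence.

    `scores` stands in for per-nucleotide classifier outputs (hypothetical).
    Returns the window centers and the power at frequency 1/3 per window.
    """
    scores = np.asarray(scores, dtype=float)
    n = np.arange(window)
    # Hann-tapered complex exponential at frequency 1/3 (assumed window type).
    kernel = np.hanning(window) * np.exp(-2j * np.pi * n / 3)
    starts = np.arange(0, len(scores) - window + 1, step)
    power = np.empty(len(starts))
    for i, s in enumerate(starts):
        segment = scores[s:s + window]
        segment = segment - segment.mean()  # remove the local offset
        power[i] = abs(np.dot(segment, kernel)) ** 2
    return starts + window // 2, power

# Synthetic stand-in for classifier outputs: noise everywhere, plus a
# high score at every third position inside a pretend coding region.
rng = np.random.default_rng(0)
scores = rng.normal(0.0, 0.3, 1800)
scores[600:1200:3] += 1.0

centers, power = period3_power(scores)
```

In this toy setting the power at frequency 1/3 rises inside positions 600 to 1200 and stays low elsewhere. Thresholding it would give a rough region boundary, although the abstract does not state the decision rule actually used.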
Date of Conference: 15-16 May 2009