On DNA Numerical Representations for Period-3 Based Exon Prediction

Authors:

Mahmood Akhtar (The University of New South Wales, Sydney 2052, Australia); Julien Epps; Eliathamby Ambikairajah

Processing DNA sequences with traditional digital signal processing methods requires, as a first step, converting them from a character string into numerical sequences. Many previously introduced representations assign values to the four DNA nucleotides A, C, G, and T in ways that impose mathematical structures not present in the actual DNA sequence. In this paper, almost all existing methods are compared for the purpose of identifying protein coding regions, using the discrete Fourier transform (DFT) based spectral content measure to exploit period-3 behaviour in the exonic regions of the GENSCAN test set. Results for false positives vs. sensitivity, the receiver operating characteristic (ROC) curve, and exonic nucleotides detected as false positives all show that the two newly proposed numerical DNA representations perform better than the well-known Z-curve, tetrahedron, and Voss representations, with 66-75% less processing. Compared with the Voss representation, the proposed paired numeric method can produce relative improvements of up to 12% in prediction accuracy of exonic nucleotides at a 10% false positive rate on the GENSCAN test set.
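
As an illustration of the baseline pipeline the abstract describes, the following is a minimal sketch of the Voss (binary indicator) representation followed by a DFT-based period-3 spectral content measure. It is not the authors' code: the function names, the sliding-window scheme, and the default window length and step are assumptions made for illustration. The paired numeric representation proposed in the paper is not defined in the abstract, so it is not reproduced here.

```python
import cmath
import math


def voss_indicators(seq):
    """Map a DNA string to four binary indicator sequences (Voss representation)."""
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}


def period3_power(window):
    """Spectral content at the period-3 frequency for one window.

    Evaluates the DFT of each indicator sequence at bin k = N/3 and sums
    the squared magnitudes over the four bases.
    """
    n = len(window)
    if n == 0 or n % 3 != 0:
        raise ValueError("window length must be a positive multiple of 3")
    total = 0.0
    for indicator in voss_indicators(window).values():
        # With k = N/3, exp(-2j*pi*k*i/N) simplifies to exp(-2j*pi*i/3).
        coeff = sum(
            value * cmath.exp(-2j * math.pi * i / 3)
            for i, value in enumerate(indicator)
        )
        total += abs(coeff) ** 2
    return total


def period3_profile(seq, window=351, step=1):
    """Slide a window along seq and return the period-3 power at each position.

    The default window length and step are illustrative assumptions only.
    """
    return [
        period3_power(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

A sequence with strong codon-like periodicity, such as a repeated `ATG`, yields a large value from `period3_power`, while a homopolymer such as a run of `A` yields a value near zero. Thresholding the output of `period3_profile` would give a crude exon/non-exon decision, which is the kind of predictor the false positive and sensitivity figures in the abstract evaluate.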

Published in:

2007 IEEE International Workshop on Genomic Signal Processing and Statistics

Date of Conference:

10-12 June 2007