Splice Site Recognition in DNA Sequences Using K-mer Frequency Based Mapping for Support Vector Machine with Power Series Kernel

Author:

Robertas Damaševičius, Software Engineering Department, Kaunas University of Technology, Kaunas

Recognition of specific, functionally important DNA sequence fragments is considered one of the most important problems in bioinformatics. One type of such fragment is the splice-junction (intron-exon or exon-intron) site. Detecting splice-junction sites in DNA sequences is important for successful gene prediction. In this paper, a support vector machine (SVM) is used to classify DNA sequences and recognize splice sites. To find the best classification, four position-independent, k-mer frequency based methods for mapping DNA sequences into the SVM feature space are analyzed. Classification is performed using SVM power series kernels. Kernel parameters are optimized using a modification of the Nelder-Mead (downhill simplex) optimization method. Classification quality is evaluated using the F-measure, which combines the precision and recall metrics. The best classification results are achieved using 4-mers for the exon-intron dataset (78%) and 6-mers for the intron-exon dataset (70%), using 4-nucleotide frequencies.
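
As an illustration of the two most concrete ideas in the abstract, the sketch below maps a DNA sequence to a position-independent vector of relative k-mer frequencies and computes an F-measure from precision and recall. The abstract does not define the four mapping methods the paper analyzes, so this shows only one generic relative-frequency mapping. The function names `kmer_frequencies` and `f_measure` are hypothetical, and the balanced (F1) form of the F-measure is an assumption rather than something the abstract states.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(sequence, k):
    """Return relative k-mer frequencies in lexicographic k-mer order.

    Assumption: this is a generic mapping, not necessarily one of the
    four methods analyzed in the paper.
    """
    seq = sequence.upper()
    # Slide a window of length k over the sequence (overlapping k-mers).
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Skip windows that contain ambiguous symbols such as N.
    valid = [w for w in windows if set(w) <= set(ALPHABET)]
    counts = Counter(valid)
    total = len(valid)
    # The vector has 4**k entries, one per possible k-mer, regardless of
    # where in the sequence each k-mer occurred.
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_measure(true_labels, predicted_labels, positive=1):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With k = 4 the feature vector has 256 entries and with k = 6 it has 4096, so the choice of k trades feature-space size against how much sequence context each feature captures.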

Published in:

International Conference on Complex, Intelligent and Software Intensive Systems (CISIS 2008)

Date of Conference:

4-7 March 2008