High performance implementation of planted motif problem using suffix trees

3 Author(s)
Dasari, N.S. (Old Dominion Univ., Norfolk, VA, USA); Ranjan, D.; Zubair, M.

In this paper we present a high-performance implementation of a suffix-tree-based solution to the planted motif problem on two different parallel architectures: NVIDIA GPUs and Intel multicore machines. An (l,d) planted motif problem (PMP) is defined as follows: given a set of n DNA sequences, each of length L, find M, the set of sequences (or motifs) of length l that have at least one d-neighbor in each of the n sequences. Here, a d-neighbor of a sequence is a sequence of the same length that differs from it in at most d positions. PMP is a well-studied problem in computational biology. It is useful in developing methods for finding transcription factor binding sites, for sequence classification, and for building phylogenetic trees. The problem is computationally challenging to solve; for example, a (19,7) PMP takes 9.9 hours on a sequential machine. Many approaches to solving the planted motif problem can be found in the literature. One approach is based on the suffix tree data structure. Though suffix-tree-based methods are the most efficient ones for solving large planted motif problems on sequential machines, they are quite difficult to parallelize. We present suffix-tree-based parallel solutions for PMP on NVIDIA GPU and Intel multicore architectures that are efficient and scalable. The solutions are based on a previously presented suffix tree algorithm but are extensively adapted to the individual architectures to ensure that the implementations work efficiently and scale well.
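To make the problem definition concrete, the following is a minimal brute-force sketch in Python. It is not the suffix tree algorithm the paper describes, and it is only practical for very small motif lengths; it simply enumerates every candidate motif and keeps those with a d-neighbor in every input sequence. The function names (`hamming`, `has_d_neighbor`, `planted_motifs`), the parameter name `l` for the motif length, and the default alphabet `ACGT` are assumptions made for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def has_d_neighbor(motif: str, seq: str, d: int) -> bool:
    """True if some window of seq, as long as motif, differs from it in at most d positions."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))


def planted_motifs(seqs: list[str], l: int, d: int, alphabet: str = "ACGT") -> set[str]:
    """Naive illustration only: tests all len(alphabet)**l candidate motifs."""
    return {
        m
        for m in map("".join, product(alphabet, repeat=l))
        if all(has_d_neighbor(m, s, d) for s in seqs)
    }


if __name__ == "__main__":
    # Hypothetical toy input, not data from the paper.
    toy = ["ACGTAC", "TTACGA", "GACGTT"]
    print(sorted(planted_motifs(toy, l=3, d=0)))
```

The candidate space grows as 4^l for DNA, which is one reason exhaustive enumeration is impractical for instances such as (19,7) and more structured approaches are needed.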

Published in:

High Performance Computing and Simulation (HPCS), 2011 International Conference on

Date of Conference:

4-8 July 2011