Parallel Large Scale Inference of Protein Domain Families

3 Author(s)
Kahn, Daniel (INRIA, Villeurbanne, France); Rezvoy, C.; Vivien, F.

The resolution of combinatorial assortments of protein sequences into domains is a prerequisite for protein sequence interpretation. However, the recognition and clustering of homologous domains from sequence databases typically scales quadratically with database size, which itself grows exponentially, making it essential to parallelize these complex bioinformatics applications. Here we demonstrate the parallelization of MKDOM2, the sequential program that has been instrumental in the construction of the PRODOM database of protein domain families. This was challenging because of (1) dependencies between program iterations, (2) their extremely heterogeneous run times, and (3) communication bottlenecks that could arise from the large size of the data. A large-scale test of the new program, MPI_MKDOM2, demonstrated its robustness against heterogeneous run times, preparing the ground for future releases of PRODOM that would otherwise be out of reach with MKDOM2 by several orders of magnitude.

Published in:

Parallel and Distributed Systems, 2008. ICPADS '08. 14th IEEE International Conference on

Date of Conference:

8-10 Dec. 2008