Parallelizing Big De Bruijn Graph Construction on Heterogeneous Processors

Abstract:

De Bruijn graph construction is the first step that de novo assemblers take to connect input reads into a complete sequence without a reference genome. This step is both time- and memory-intensive. To address this problem, we develop ParaHash, a system that partitions the input data in a compact format, parallelizes the computation across both the CPUs and the GPUs of a single computer, and performs hash-based De Bruijn graph construction. In this way, ParaHash uses all available processors to assemble big genomes that cannot fit into memory. Furthermore, we analyze the characteristics of genome data to set the hash table size, design concurrent hashing algorithms to handle the inherent multiplicity, and pipeline the data transfer and the computation for further efficiency. Our experiments on real-world genome datasets show that the workload was balanced across heterogeneous processors, and that ParaHash was able to construct billion-node graphs on a single machine, with overall performance up to 20 times faster than state-of-the-art shared-memory assemblers.
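
To make the idea of hash-based De Bruijn graph construction concrete, the following is a minimal single-threaded sketch, not ParaHash's implementation. It assumes a plain Python dictionary as the hash table, and the function name `build_de_bruijn` and its parameters are hypothetical. Each k-mer is a vertex, and a count is kept for every base that follows it, which illustrates the multiplicity the abstract mentions (the same k-mer appearing in many reads). Partitioning, GPU execution, concurrency, and reverse-complement handling are all omitted.

```python
def build_de_bruijn(reads, k):
    """Build a toy De Bruijn graph from reads.

    Returns a dict mapping each k-mer (vertex) to a dict of
    {next_base: multiplicity}, where multiplicity counts how many
    times that edge was observed across all reads.
    This is an illustrative sketch, not the ParaHash algorithm.
    """
    graph = {}
    for read in reads:
        # Slide a window of length k over the read; reads shorter
        # than k produce an empty range and contribute nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Hash-table lookup or insert for the vertex.
            successors = graph.setdefault(kmer, {})
            # The last k-mer of a read has no following base.
            if i + k < len(read):
                base = read[i + k]
                successors[base] = successors.get(base, 0) + 1
    return graph


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "CGTACG"]  # made-up reads for illustration
    for kmer, edges in sorted(build_de_bruijn(toy_reads, 3).items()):
        print(kmer, edges)
```

In a parallel setting, the lookup-or-insert step and the counter increment would be the points that need concurrent hashing, since many threads may touch the same k-mer at once.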

Published in: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)

Date of Conference: 05-08 June 2017

Date Added to IEEE Xplore: 17 July 2017

ISBN Information:

Print ISSN: 1063-6927

DOI: 10.1109/ICDCS.2017.250

Conference Location: Atlanta, GA, USA
