Next-generation sequencing (NGS) assemblers face the problem of handling massive numbers of reads. The bi-directed de Bruijn graph is the most fundamental data structure on which numerous NGS assemblers (e.g. Velvet, ABySS) have been built. Most of these assemblers differ only in the heuristics they employ to operate on this graph. These heuristics are composed of several fundamental operations, such as construction, compaction, and pruning of the underlying bi-directed de Bruijn graph. Unfortunately, the current algorithms for these fundamental operations are computationally inefficient and have become a bottleneck in scaling NGS assemblers. This talk discusses some recent results that provide computationally efficient algorithms for these fundamental bi-directed de Bruijn graph operations. The algorithms are based on sorting and are efficient in sequential, out-of-core, and parallel settings.
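To make the sorting idea concrete, the following is a minimal sketch of how sorting can drive the construction step. It is not the algorithm presented in the talk. It assumes reads over the uppercase alphabet ACGT, treats each (k+1)-mer as an edge between two k-mer nodes, and merges a sequence with its reverse complement by keeping the lexicographically smaller of the two. The function names (`reverse_complement`, `canonical`, `build_edges`, `build_nodes`) are hypothetical, and the in-memory `sort` stands in for whatever external or parallel sort an out-of-core or distributed setting would use.

```python
from itertools import groupby

# Assumption: reads contain only the uppercase bases A, C, G, T.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq):
    """Pick one representative for a sequence and its reverse complement."""
    return min(seq, reverse_complement(seq))


def build_edges(reads, k):
    """Return a sorted list of (canonical (k+1)-mer, multiplicity) pairs.

    Each (k+1)-mer in a read is one edge record. Sorting the records
    places duplicates next to each other, so one linear scan collapses
    them into distinct edges with counts.
    """
    records = []
    for read in reads:
        for i in range(len(read) - k):
            records.append(canonical(read[i:i + k + 1]))
    records.sort()  # stand-in for an external or parallel sort
    return [(edge, sum(1 for _ in group)) for edge, group in groupby(records)]


def build_nodes(edges, k):
    """Return the sorted canonical k-mer nodes touched by the edges."""
    nodes = set()
    for edge, _count in edges:
        nodes.add(canonical(edge[:k]))   # prefix k-mer of the edge
        nodes.add(canonical(edge[1:]))   # suffix k-mer of the edge
    return sorted(nodes)


if __name__ == "__main__":
    # The second read is the reverse complement of the first,
    # so both contribute to the same canonical edges.
    reads = ["ACGTA", "TACGT"]
    edges = build_edges(reads, k=2)
    print(edges)
    print(build_nodes(edges, k=2))
```

Because the only global step is a sort followed by a linear scan, the same pattern carries over to settings where the records do not fit in memory or are spread across machines. The sketch omits edge orientations, compaction, and pruning.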