Phylogenetic models of rate heterogeneity: a high performance computing perspective


Abstract:

Inference of phylogenetic trees using the maximum likelihood (ML) method is NP-hard. Furthermore, the computation of the likelihood function for huge trees of more than 1,000 organisms is computationally intensive due to a large number of floating-point operations and high memory consumption. Within this context, the present paper compares two competing mathematical models that account for evolutionary rate heterogeneity: the Gamma and CAT models. The intention of this paper is to show that, from a purely empirical point of view, CAT can be used instead of Gamma. The main advantages of CAT over Gamma are significantly lower memory consumption and faster inference times. An experimental study using RAxML has been performed on 19 real-world datasets comprising 73 to 1,663 DNA sequences. Results show that CAT is on average 5.5 times faster than Gamma and, surprisingly enough, also yields trees with slightly superior Gamma likelihood values. The usage of the CAT model reduces the average number of L2 and L3 cache misses by a factor of 8.55.

Published in: Proceedings 20th IEEE International Parallel & Distributed Processing Symposium

Date of Conference: 25-29 April 2006

Date Added to IEEE Xplore: 26 June 2006

Print ISBN:1-4244-0054-6

Print ISSN: 1530-2075

DOI: 10.1109/IPDPS.2006.1639535

Conference Location: Rhodes, Greece

1. Introduction

Phylogenetic trees are used to represent the evolutionary history of a set of organisms (also called taxa). A multiple alignment of a small region of their DNA or protein sequences can be used as input for the computation of phylogenies. In a computational context, phylogenetic trees are usually strictly bifurcating unrooted trees. The organisms of the alignment are located at the tips, and the inner nodes represent extinct common ancestors. The branches of the tree represent the time required for one species to mutate into another, new one. The inference of phylogenies with computational methods has many important applications in medical and biological research, such as drug discovery and conservation biology (see [1] for a summary). Due to the rapid growth of available sequence data and the constant improvement of multiple alignment methods, it has now become feasible to compute large trees which comprise more than 1,000 organisms. The computation of the tree-of-life containing representatives of all living beings on earth is one of the grand challenges in Bioinformatics.
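
To make the scale of the problem concrete, the following is a minimal Python sketch (not taken from the paper) of the standard counting results for strictly bifurcating unrooted trees: a tree with n tips has n - 2 inner nodes, 2n - 3 branches, and (2n - 5)!! distinct topologies. The function name `unrooted_binary_tree_counts` is hypothetical and chosen only for illustration.

```python
def unrooted_binary_tree_counts(n_tips: int) -> dict:
    """Return basic counts for a strictly bifurcating unrooted tree.

    Illustrative helper (hypothetical name, not part of RAxML or the paper).
    """
    if n_tips < 3:
        raise ValueError("an unrooted bifurcating tree needs at least 3 tips")

    inner_nodes = n_tips - 2       # every inner node has exactly 3 neighbours
    branches = 2 * n_tips - 3      # tips + inner nodes - 1

    # Number of distinct topologies: (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5)
    topologies = 1
    for k in range(3, 2 * n_tips - 4, 2):
        topologies *= k

    return {
        "inner_nodes": inner_nodes,
        "branches": branches,
        "topologies": topologies,
    }


if __name__ == "__main__":
    for n in (4, 10, 73):
        counts = unrooted_binary_tree_counts(n)
        # Print the digit count instead of the full number for large n.
        print(n, counts["inner_nodes"], counts["branches"],
              len(str(counts["topologies"])), "digits")
```

Because the number of topologies grows super-exponentially with the number of tips, exhaustive evaluation of all trees is out of reach for datasets of this size, which is one reason heuristic searches and efficient likelihood computation matter here.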
