A Heuristic for Maximum Greedy Consensus Tree Problem


Abstract:

A phylogenetic tree is a tree that represents the evolutionary history of a set of species. Given a set of $k$ conflicting phylogenetic trees whose leaves are labeled by $n$ species, a greedy consensus tree is a tree formed by choosing clusters in decreasing order of their counts. The maximum greedy consensus tree (MGCT) problem asks for the greedy consensus tree with the maximum number of internal nodes. The MGCT problem is known to be NP-hard for $k \geq 3$. In this paper, we propose and implement a heuristic that runs in $O(k^{3}n^{5})$ time. Experimental results on our dataset show that the heuristic constructs a greedy consensus tree whose size is 23.4/26 of the binary tree. We also identify a class of phylogenetic trees on which our algorithm performs better than a non-deterministic approach such as random selection, which breaks ties between cluster frequencies randomly.
Date of Conference: 21-23 December 2022
Date Added to IEEE Xplore: 04 April 2023
Conference Location: Dhaka, Bangladesh

I. Introduction

A phylogenetic tree is a rooted, unordered tree in which each leaf is uniquely labeled. It represents the evolutionary history of a set of species: each leaf represents an existing species, and the internal nodes represent hypothetical ancestors [8]. Because phylogenies describe what happened in the past, they cannot be observed directly and must instead be estimated. Consequently, various sophisticated statistical models of sequence evolution have been developed to estimate phylogenetic trees [5–7]. Owing to rapid advances in DNA sequencing, a large amount of data is now available for reconstructing phylogenetic trees. The consensus tree problem was proposed for summarizing the conflicts among a set of gene trees [1]. A consensus tree is a single computed tree that summarizes the information in all the competing trees. Formally, the input of the consensus tree problem is a set of $k$ gene trees, leaf-labeled by the same set of species $L$, where $|L| = n$. The output is a consensus tree $T$ that is leaf-labeled by $L$ and summarizes the branching information of all $k$ gene trees. Since Adams first proposed the consensus tree problem [1], various consensus methods have been proposed [2]. Our study focuses on the greedy consensus tree problem, which is an extension of the majority rule consensus tree.
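To make the greedy construction from the abstract concrete (clusters are chosen in decreasing order of their counts), the following is a minimal Python sketch, not the heuristic proposed in this paper. It rests on several assumptions that do not come from the source: trees are encoded as nested tuples of leaf-label strings, the helper names `clusters`, `compatible`, and `greedy_consensus_clusters` are hypothetical, and ties between equally frequent clusters are broken lexicographically, which is an arbitrary choice made only to keep the output deterministic.

```python
from collections import Counter
from itertools import chain


def clusters(tree):
    """Return the non-trivial clusters of a rooted tree.

    Assumed encoding: a tree is either a leaf label (str) or a tuple of subtrees.
    A cluster is the frozenset of leaf labels below an internal node.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset(chain.from_iterable(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root cluster appears in every tree, so skip it
    return found


def compatible(a, b):
    """Two clusters can coexist in one tree iff they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a


def greedy_consensus_clusters(trees):
    """Pick clusters by decreasing count, keeping each one that is compatible
    with every cluster already kept.

    Tie-breaking (lexicographic on sorted labels) is an assumption of this
    sketch; choosing the tie order well is exactly what the MGCT problem asks.
    """
    counts = Counter(chain.from_iterable(clusters(t) for t in trees))
    ordered = sorted(counts, key=lambda c: (-counts[c], sorted(c)))
    chosen = []
    for cluster in ordered:
        if all(compatible(cluster, kept) for kept in chosen):
            chosen.append(cluster)
    return chosen


if __name__ == "__main__":
    # Three hypothetical gene trees over the species {a, b, c, d}.
    trees = [
        (("a", "b"), ("c", "d")),
        ((("a", "b"), "c"), "d"),
        (("a", ("b", "c")), "d"),
    ]
    result = greedy_consensus_clusters(trees)
    print([sorted(c) for c in result])  # [['a', 'b'], ['a', 'b', 'c']]
```

In this toy input, the clusters {a, b} and {a, b, c} each occur twice and are kept, while {b, c} and {c, d} occur once and conflict with a kept cluster. The order in which tied clusters are visited can change which clusters survive, and hence the number of internal nodes of the resulting tree.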
