As sequencing efforts and the volume of genomic data continue to grow at an accelerating rate, phylogenetic analysis provides an evolutionary context for understanding and interpreting this growing body of complex data. We introduce a novel quartet-based method for inferring molecular phylogenies, called hypercleaning* (HC*). HC* builds on the hypercleaning (HC) technique, which has the useful property of recovering the edges of a phylogenetic tree that are best supported by the witness quartet set. HC* extends HC in two respects: (i) whereas HC restricts the input quartet set to be unweighted (binary-valued), HC* accepts any positive-valued quartet scores, allowing more informative quartets to be defined; (ii) HC* employs a novel collapsing technique that significantly speeds up the inference stage, making it empirically on par with quartet puzzling in speed while still guaranteeing optimal edge recovery, as in HC. This paper primarily presents the algorithmic construction of HC*. We also report preliminary studies of an implementation of HC* as a potentially powerful approximation scheme for maximum-likelihood-based inference.
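To make the notion of an edge being "supported by the witness quartet set" concrete, the following is a minimal sketch of one way weighted quartet support for a single split could be scored. It is not the HC* algorithm itself: the names `canonical` and `edge_support`, the toy weights, and the choice to normalise by the number of induced quartets (which assumes weights lie in (0, 1]) are all illustrative assumptions rather than details taken from the paper.

```python
from itertools import combinations


def canonical(a, b, c, d):
    """Return a hashable, order-independent key for the quartet ab|cd."""
    return frozenset((frozenset((a, b)), frozenset((c, d))))


def edge_support(side_a, side_b, weights):
    """Average witness weight over the quartets induced by side_a | side_b.

    A quartet ab|cd is induced by the split when a, b lie on one side and
    c, d on the other. Quartets absent from `weights` count as weight 0.
    Hypothetical scoring rule, assumed here for illustration only.
    """
    induced = [
        canonical(a, b, c, d)
        for a, b in combinations(sorted(side_a), 2)
        for c, d in combinations(sorted(side_b), 2)
    ]
    if not induced:
        return 0.0  # a side with fewer than two taxa induces no quartets
    return sum(weights.get(q, 0.0) for q in induced) / len(induced)


# Toy weighted witness set over five taxa (illustrative values only).
weights = {
    canonical("A", "B", "C", "D"): 1.0,
    canonical("A", "B", "C", "E"): 0.8,
    canonical("A", "B", "D", "E"): 0.6,
    canonical("A", "C", "B", "D"): 0.3,
}

support_ab = edge_support({"A", "B"}, {"C", "D", "E"}, weights)
support_ac = edge_support({"A", "C"}, {"B", "D", "E"}, weights)
```

In this toy example the split AB|CDE is covered by all three of its induced quartets, whereas AC|BDE is covered by only one low-weight quartet, so the former scores higher. With binary weights the same function reduces to the fraction of induced quartets present in the witness set, which is the unweighted setting the abstract attributes to HC.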