Skip to Main Content
As gene order evolves through a variety of chromosomal rearrangements, conserved segments provide important insight into evolutionary relationships and functional roles of genes. However, gene loss within otherwise conserved segments, as typically occurs following large-scale genome duplication, has received limited algorithmic study. This has been a major impediment to comparative genomics in certain taxa, such as plants and fish. We propose a heuristic algorithm/or the inference of ancestral gene order in a set of related genomes that have undergone large-scale duplication and gene loss. First, approximately conserved (i.e. homologous) segments are identified using pairwise local genome alignment. Second, homologous segments are iteratively clustered under the control of two parameters, (1) the minimal required number of shared genes between two clusters and (2) the maximal allowed number of rearrangement breakpoints along the lineage leading to each descendant segment. Finally, we compute an estimated ancestral gene order for each cluster that is optimal in some sense. We evaluate the performance of this algorithm on simulated data that models a genome evolving by large-scale duplication, duplicate gene loss, transposition, translocation, and inversion. The results suggest that long segments of ancestral gene order may be reconstructed following moderate levels of rearrangement with only minor loss of accuracy.