Limits...
A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions.

Bader M, Abouelhoda MI, Ohlebusch E - BMC Bioinformatics (2008)

Bottom Line: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available.However, this phylogenetic reconstruction is a very challenging task.For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Theoretical Computer Science, University of Ulm, Ulm, Germany. martin.bader@uni-ulm.de

ABSTRACT

Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.

Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.

Show MeSH

Related in: MedlinePlus

Variance of the metazoan dataset. The results of our algorithm for the metazoan dataset over all runs as box plots. For each improvement strategy, also the average running time per run is provided. (a) contains the results for the weight ratio 1:2, (b) for the weight ratio 1:1.5, and (c) for the weight ratio 1:1.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2655097&req=5

Figure 4: Variance of the metazoan dataset. The results of our algorithm for the metazoan dataset over all runs as box plots. For each improvement strategy, also the average running time per run is provided. (a) contains the results for the weight ratio 1:2, (b) for the weight ratio 1:1.5, and (c) for the weight ratio 1:1.

Mentions: The best results for the metazoan dataset were achieved by phylo-m and phylo-tm, except for the weight ratio 1:1.5, where phylo-m found a tree with weight 119.5, while phylo-tm only found a tree with weight 120.5. While the algorithm was very fast for the weight ratio 1:2 (2:36 min per run for phylo-m and 6:13 for phylo-tm), the running times increased for the other weight ratios when the median solver was used. Nevertheless, the running times of about 1.5 hours per run are still feasible in practice. A comparison of our trees with those found by amGRP and MGR again showed that the direct usage of the weighted reversal and transposition distance is of advantage. Except for the weight ratio 1:2, for which amGRP found a better tree, the trees found by our program have a lower weight under the weighted reversal and transposition distance than the trees found by the other programs. Details can be found in Table 4. The variance of the different runs of our algorithm is depicted in Fig. 4.


A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions.

Bader M, Abouelhoda MI, Ohlebusch E - BMC Bioinformatics (2008)

Variance of the metazoan dataset. The results of our algorithm for the metazoan dataset over all runs as box plots. For each improvement strategy, also the average running time per run is provided. (a) contains the results for the weight ratio 1:2, (b) for the weight ratio 1:1.5, and (c) for the weight ratio 1:1.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2655097&req=5

Figure 4: Variance of the metazoan dataset. The results of our algorithm for the metazoan dataset over all runs as box plots. For each improvement strategy, also the average running time per run is provided. (a) contains the results for the weight ratio 1:2, (b) for the weight ratio 1:1.5, and (c) for the weight ratio 1:1.
Mentions: The best results for the metazoan dataset were achieved by phylo-m and phylo-tm, except for the weight ratio 1:1.5, where phylo-m found a tree with weight 119.5, while phylo-tm only found a tree with weight 120.5. While the algorithm was very fast for the weight ratio 1:2 (2:36 min per run for phylo-m and 6:13 for phylo-tm), the running times increased for the other weight ratios when the median solver was used. Nevertheless, the running times of about 1.5 hours per run are still feasible in practice. A comparison of our trees with those found by amGRP and MGR again showed that the direct usage of the weighted reversal and transposition distance is of advantage. Except for the weight ratio 1:2, for which amGRP found a better tree, the trees found by our program have a lower weight under the weighted reversal and transposition distance than the trees found by the other programs. Details can be found in Table 4. The variance of the different runs of our algorithm is depicted in Fig. 4.

Bottom Line: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available.However, this phylogenetic reconstruction is a very challenging task.For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Theoretical Computer Science, University of Ulm, Ulm, Germany. martin.bader@uni-ulm.de

ABSTRACT

Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.

Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.

Show MeSH
Related in: MedlinePlus