Limits...
A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions.

Bader M, Abouelhoda MI, Ohlebusch E - BMC Bioinformatics (2008)

Bottom Line: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available.However, this phylogenetic reconstruction is a very challenging task.For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Theoretical Computer Science, University of Ulm, Ulm, Germany. martin.bader@uni-ulm.de

ABSTRACT

Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.

Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.

Show MeSH

Related in: MedlinePlus

Improving the tree topology. The subtrees T1 and T2 are reconnected by an edge from s to v2, where s ∈ Cloud(v1, ) and v2 ∈ V2. The edge (v1, ) must be split into two edges (v1, s) and (s, ) before the new edge (s, v2) is added. Clouds are removed/generated accordingly.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2655097&req=5

Figure 2: Improving the tree topology. The subtrees T1 and T2 are reconnected by an edge from s to v2, where s ∈ Cloud(v1, ) and v2 ∈ V2. The edge (v1, ) must be split into two edges (v1, s) and (s, ) before the new edge (s, v2) is added. Clouds are removed/generated accordingly.

Mentions: The construction phase may get trapped in a local minimum. To avoid this, it is followed by an improvement phase, which iteratively tries to find edges that are better than the existing ones. The input to the improvement algorithm is a tree T' = (V', E') for P in conjunction with the clouds of the edges. The algorithm works as follows. First, we temporarily remove an edge e ∈ E'. This splits T' into two subtrees T1 = (V1, E1) and T2 = (V2, E2). Then we search for a better edge (w1, w2) that reconnects these two subtrees as follows. w1 is either a node in V1 or a permutation in a cloud of an edge (v1, ) ∈ E1. In the latter case, we alter the tree T1 into a tree by replacing the edge (v1, ) with the two edges (v1, w1) and (w1, ), i.e., = V1 ∪ {w1} and . Otherwise, we set = T1. The tree is defined analogously to by distinguishing as to whether w2 is a node in V2 or a permutation in a cloud of an edge (v2, ) ∈ E2. The new tree is obtained by connecting and by the edge (w1, w2), i.e., and . For an example, see Fig. 2. If w() <w(T'), these changes are accepted (i.e., T' := ), otherwise they are discarded. When the changes are accepted, we generate the clouds for the new edges and delete the clouds of removed edges.


A fast algorithm for the multiple genome rearrangement problem with weighted reversals and transpositions.

Bader M, Abouelhoda MI, Ohlebusch E - BMC Bioinformatics (2008)

Improving the tree topology. The subtrees T1 and T2 are reconnected by an edge from s to v2, where s ∈ Cloud(v1, ) and v2 ∈ V2. The edge (v1, ) must be split into two edges (v1, s) and (s, ) before the new edge (s, v2) is added. Clouds are removed/generated accordingly.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2655097&req=5

Figure 2: Improving the tree topology. The subtrees T1 and T2 are reconnected by an edge from s to v2, where s ∈ Cloud(v1, ) and v2 ∈ V2. The edge (v1, ) must be split into two edges (v1, s) and (s, ) before the new edge (s, v2) is added. Clouds are removed/generated accordingly.
Mentions: The construction phase may get trapped in a local minimum. To avoid this, it is followed by an improvement phase, which iteratively tries to find edges that are better than the existing ones. The input to the improvement algorithm is a tree T' = (V', E') for P in conjunction with the clouds of the edges. The algorithm works as follows. First, we temporarily remove an edge e ∈ E'. This splits T' into two subtrees T1 = (V1, E1) and T2 = (V2, E2). Then we search for a better edge (w1, w2) that reconnects these two subtrees as follows. w1 is either a node in V1 or a permutation in a cloud of an edge (v1, ) ∈ E1. In the latter case, we alter the tree T1 into a tree by replacing the edge (v1, ) with the two edges (v1, w1) and (w1, ), i.e., = V1 ∪ {w1} and . Otherwise, we set = T1. The tree is defined analogously to by distinguishing as to whether w2 is a node in V2 or a permutation in a cloud of an edge (v2, ) ∈ E2. The new tree is obtained by connecting and by the edge (w1, w2), i.e., and . For an example, see Fig. 2. If w() <w(T'), these changes are accepted (i.e., T' := ), otherwise they are discarded. When the changes are accepted, we generate the clouds for the new edges and delete the clouds of removed edges.

Bottom Line: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available.However, this phylogenetic reconstruction is a very challenging task.For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

View Article: PubMed Central - HTML - PubMed

Affiliation: Institute of Theoretical Computer Science, University of Ulm, Ulm, Germany. martin.bader@uni-ulm.de

ABSTRACT

Background: Due to recent progress in genome sequencing, more and more data for phylogenetic reconstruction based on rearrangement distances between genomes become available. However, this phylogenetic reconstruction is a very challenging task. For the most simple distance measures (the breakpoint distance and the reversal distance), the problem is NP-hard even if one considers only three genomes.

Results: In this paper, we present a new heuristic algorithm that directly constructs a phylogenetic tree w.r.t. the weighted reversal and transposition distance. Experimental results on previously published datasets show that constructing phylogenetic trees in this way results in better trees than constructing the trees w.r.t. the reversal distance, and recalculating the weight of the trees with the weighted reversal and transposition distance. An implementation of the algorithm can be obtained from the authors.

Conclusion: The possibility of creating phylogenetic trees directly w.r.t. the weighted reversal and transposition distance results in biologically more realistic scenarios. Our algorithm can solve today's most challenging biological datasets in a reasonable amount of time.

Show MeSH
Related in: MedlinePlus