Consequences of Common Topological Rearrangements for Partition Trees in Phylogenomic Inference.

Chernomor O, Minh BQ, von Haeseler A - J. Comput. Biol. (2015)

Bottom Line: If the topological rearrangement applied to a species tree does not change the induced partition trees, the score of these partition trees is unchanged. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original "full" terraces. Hence, partial terraces contribute more to the time savings than full terraces.


Affiliation: Max F. Perutz Laboratories, Center for Integrative Bioinformatics Vienna, University of Vienna, Vienna, Austria.

ABSTRACT
In phylogenomic analysis, collections of trees with identical score (maximum likelihood or parsimony score) may hamper tree search algorithms. Such collections are called phylogenetic terraces. For sparse supermatrices with a lot of missing data, the number of terraces and the number of trees on the terraces can be very large. If terraces are not taken into account, a lot of computation time might be spent unnecessarily on evaluating many trees that in fact have identical score. To save computation time during the tree search, it is worthwhile to quickly identify such cases. The score of a species tree is the sum of scores for all the so-called induced partition trees. Therefore, if the topological rearrangement applied to a species tree does not change the induced partition trees, the score of these partition trees is unchanged. Here, we provide the conditions under which the three most widely used topological rearrangements (nearest neighbor interchange, subtree pruning and regrafting, and tree bisection and reconnection) change the topologies of induced partition trees. During the tree search, these conditions allow us to quickly identify whether we can save computation time on the evaluation of newly encountered trees. We also introduce the concept of partial terraces and demonstrate that they occur more frequently than the original "full" terraces. Hence, partial terraces contribute more to the time savings than full terraces. Therefore, taking into account the above conditions and the partial terrace concept will help to speed up the tree search in phylogenomic inference.
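
The idea that a species tree score is a sum over induced partition trees, and that unchanged partition trees need not be rescored, can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: a tree is represented by its set of non-trivial splits, `CachedScorer`, `induced_topology`, and `tree_with_central_split` are hypothetical names, and the scoring function is a toy stand-in for a real likelihood or parsimony computation.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Split = FrozenSet[FrozenSet[str]]   # unordered pair of taxon sets
Topology = FrozenSet[Split]         # a tree, represented by its non-trivial splits


def make_split(side1: Iterable[str], side2: Iterable[str]) -> Split:
    return frozenset([frozenset(side1), frozenset(side2)])


def induced_topology(tree: Topology, taxa: FrozenSet[str]) -> Topology:
    """Restrict each split to `taxa`; keep only splits with >= 2 taxa per side."""
    induced = set()
    for split in tree:
        sides = [side & taxa for side in split]
        if all(len(side) >= 2 for side in sides):
            induced.add(frozenset(sides))
    return frozenset(induced)


class CachedScorer:
    """Sum per-partition scores, reusing scores of unchanged induced trees."""

    def __init__(self, partitions: List[FrozenSet[str]],
                 score_partition: Callable[[int, Topology], float]) -> None:
        self.partitions = partitions
        self.score_partition = score_partition  # hypothetical scoring callback
        self.cache: Dict[Tuple[int, Topology], float] = {}
        self.evaluations = 0  # number of real (uncached) evaluations

    def score(self, tree: Topology) -> float:
        total = 0.0
        for index, taxa in enumerate(self.partitions):
            key = (index, induced_topology(tree, taxa))
            if key not in self.cache:
                self.cache[key] = self.score_partition(index, key[1])
                self.evaluations += 1
            total += self.cache[key]
        return total


# Toy data: four subtrees A, B, C, D joined by one central edge.
A, B, C, D = {"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}, {"d1", "d2"}
ALL = A | B | C | D


def tree_with_central_split(left, right) -> Topology:
    pendant = [make_split(X, ALL - X) for X in (A, B, C, D)]
    return frozenset(pendant + [make_split(left, right)])


T = tree_with_central_split(A | B, C | D)
T_NNI = tree_with_central_split(A | D, B | C)   # tree after the NNI

# Partition 0 has no data for the taxa in A; partition 1 covers all taxa.
partitions = [frozenset(B | C | D), frozenset(ALL)]

# Toy score (an assumption for illustration): minus the number of splits.
scorer = CachedScorer(partitions, lambda index, topology: -float(len(topology)))
score_T = scorer.score(T)
score_T_NNI = scorer.score(T_NNI)
```

In this toy setting, scoring `T_NNI` after `T` triggers a real evaluation only for the partition that covers all taxa; the partition without taxa from A is served from the cache, which is the kind of saving the partial terrace concept describes.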


Figure 4: An example when an NNI on T does not change the topology of T/Y. Solid lines correspond to two induced partition trees before (T/Y) and after (T_NNI/Y) the NNI was applied to edge e on T by swapping the subtrees below e1 and e3 (Fig. 1). In this case, Y does not have a representative in A (i.e., A ∩ Y = ∅); therefore, (A ∪ B) ∩ Y / (C ∪ D) ∩ Y = B ∩ Y / (C ∪ D) ∩ Y and (A ∪ D) ∩ Y / (B ∪ C) ∩ Y = D ∩ Y / (B ∪ C) ∩ Y. Since the splits B ∩ Y / (C ∪ D) ∩ Y and D ∩ Y / (B ∪ C) ∩ Y are shared by T/Y and T_NNI/Y, Σ(T/Y) Δ Σ(T_NNI/Y) = ∅ and the RF distance between T/Y and T_NNI/Y is 0.
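
The set identities in this caption can be checked directly with Python sets. The following is a minimal sketch under assumptions: the taxon names are made up, splits inside the subtrees A, B, C, D are omitted because they are the same before and after the NNI, and the RF distance is taken as the size of the symmetric difference of the two split sets. The helper names `restrict`, `induced_splits`, and `rf_distance` are hypothetical.

```python
def restrict(split, taxa):
    """Restrict a split (pair of taxon sets) to `taxa`, as an unordered pair."""
    left, right = split
    return frozenset([frozenset(left & taxa), frozenset(right & taxa)])


def induced_splits(splits, taxa):
    """Split set of the induced tree: restricted splits with no empty side."""
    restricted = (restrict(split, taxa) for split in splits)
    return {s for s in restricted if len(s) == 2 and all(s)}


def rf_distance(splits1, splits2):
    """Size of the symmetric difference of two split sets."""
    return len(splits1 ^ splits2)


# Hypothetical taxa in the four subtrees around the NNI edge.
A, B, C, D = {"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}, {"d1", "d2"}
ALL = A | B | C | D

# Splits of the edges leading to A, B, C, D are present in both trees.
pendant = [(X, ALL - X) for X in (A, B, C, D)]
sigma_T = pendant + [(A | B, C | D)]       # central split before the NNI
sigma_T_NNI = pendant + [(A | D, B | C)]   # central split after the NNI

# Case of the figure: Y has no representative in A.
Y = B | C | D
rf_without_A = rf_distance(induced_splits(sigma_T, Y),
                           induced_splits(sigma_T_NNI, Y))

# Contrast: Y has one representative in each of A, B, C, D.
Y_all = {"a1", "b1", "c1", "d1"}
rf_with_all = rf_distance(induced_splits(sigma_T, Y_all),
                          induced_splits(sigma_T_NNI, Y_all))
```

With A ∩ Y = ∅ the restricted central split coincides with the restricted split of the edge leading to B (before the NNI) or to D (after it), so the two induced split sets are equal; with a representative in every subtree, the two central splits differ.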

It is easy to show that if at least one of the sets A ∩ Y, B ∩ Y, C ∩ Y, D ∩ Y is empty, then both splits (A ∪ B) ∩ Y / (C ∪ D) ∩ Y and (A ∪ D) ∩ Y / (B ∪ C) ∩ Y coincide with splits shared by T/Y and T_NNI/Y (e.g., see Fig. 4). Hence, Σ(T/Y) Δ Σ(T_NNI/Y) = ∅ and the RF distance between these trees is 0. Therefore, for T/Y and T_NNI/Y to have different topologies, all of A ∩ Y, B ∩ Y, C ∩ Y, D ∩ Y must be nonempty, meaning that Y must have at least one representative in each of the subsets A, B, C, D.   ■
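
During a tree search, this condition gives a cheap test for which partitions could be affected by an NNI. The sketch below assumes that the taxon sets of the four subtrees around the NNI edge and of each partition are available as Python sets; the function names and the example data are hypothetical.

```python
from typing import List, Sequence, Set


def nni_may_change_induced_tree(A: Set[str], B: Set[str], C: Set[str],
                                D: Set[str], Y: Set[str]) -> bool:
    """True only if Y has at least one representative in each of A, B, C, D.

    If any intersection is empty, the induced partition tree T/Y is
    unchanged by the NNI and its score can be reused.
    """
    return all(subtree & Y for subtree in (A, B, C, D))


def partitions_to_reevaluate(A: Set[str], B: Set[str], C: Set[str],
                             D: Set[str],
                             partitions: Sequence[Set[str]]) -> List[int]:
    """Indices of partitions whose induced tree may change under the NNI."""
    return [index for index, Y in enumerate(partitions)
            if nni_may_change_induced_tree(A, B, C, D, Y)]


# Hypothetical example: three partitions with different taxon coverage.
A, B, C, D = {"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}, {"d1", "d2"}
partitions = [
    B | C | D,                   # no representative in A
    {"a1", "b1", "c1", "d1"},    # one representative in every subtree
    A | B,                       # no representative in C or D
]
affected = partitions_to_reevaluate(A, B, C, D, partitions)
```

Here only the second partition has taxa in all four subtrees, so it is the only one whose induced tree needs to be examined after the NNI; the other two can keep their previous scores.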

