Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees.

Arnold C, Stadler PF - Algorithms Mol Biol (2010)

Bottom Line: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity O(n^4 log n) w.r.t. the number n of leaves. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.


Affiliation: Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.de.

ABSTRACT

Background: The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that are of considerable interest in bioinformatics: Given an arbitrary phylogenetic tree T and weights ω_xy for the paths between all pairs of leaves (x, y), what is the collection of edge-disjoint paths between pairs of leaves that maximizes the total weight? Special cases of the MPP for binary trees and equal weights have been described previously; algorithms to solve the general MPP are still missing, however.
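To make the problem statement concrete, the following is a minimal brute-force sketch of the MPP on an arbitrary (possibly multifurcating) tree. It is exponential in the number of leaves and is not the authors' algorithm; it only spells out the objective. The child-to-parent map `parent`, the `frozenset`-keyed `weight` table, and both function names are assumptions made for illustration.

```python
from itertools import combinations


def path_edges(parent, x, y):
    """Edges on the tree path between x and y, given a child -> parent map.

    Every edge is identified by the pair (child, parent).
    """
    def to_root(v):
        chain = [v]
        while v in parent:
            v = parent[v]
            chain.append(v)
        return chain

    up_x, up_y = to_root(x), to_root(y)
    edges_x = set(zip(up_x, up_x[1:]))
    edges_y = set(zip(up_y, up_y[1:]))
    # Edges above the lowest common ancestor occur in both sets and cancel.
    return edges_x ^ edges_y


def brute_force_mpp(parent, leaves, weight):
    """Best total weight over all sets of pairwise edge-disjoint leaf paths."""
    pairs = list(combinations(sorted(leaves), 2))
    best = 0  # the empty path-system is always admissible

    def extend(start, used, total):
        nonlocal best
        best = max(best, total)
        for j in range(start, len(pairs)):
            x, y = pairs[j]
            edges = path_edges(parent, x, y)
            if not (edges & used):  # keep only edge-disjoint extensions
                extend(j + 1, used | edges, total + weight[frozenset((x, y))])

    extend(0, frozenset(), 0)
    return best


# Hypothetical example: a star tree whose root "r" has four leaf children.
parent = {"a": "r", "b": "r", "c": "r", "d": "r"}
weight = {frozenset(p): 1 for p in combinations("abcd", 2)}
weight[frozenset("ac")] = 5
weight[frozenset("bd")] = 5
print(brute_force_mpp(parent, "abcd", weight))
```

In this star tree the paths a-c and b-d share no edge, so both can be selected. In a binary tree on the same four leaves with a and b as siblings, the two paths would compete for the edge above a and b.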

Results: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity O(n^4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP.
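For the binary-tree case, a dynamic program of this kind can be sketched as follows. This is an illustrative reconstruction, not the recurrences of the paper, whose equations are not reproduced in this summary. It assumes a tree given as nested pairs with leaf names as strings and a hypothetical `frozenset`-keyed `weight` table. For every subtree it keeps a "closed" score and, for each leaf x, an "open" score in which the path from x up to the subtree root is reserved for a pairing made further up.

```python
from itertools import product


def max_pairing(tree, weight):
    """Best total weight of edge-disjoint leaf-to-leaf paths in a binary tree.

    tree:   a leaf name (str) or a pair (left, right) of subtrees.
    weight: dict mapping frozenset({x, y}) to the weight of the path x-y.
    """
    def solve(node):
        # Returns (closed, open_): closed is the best score inside this
        # subtree; open_[x] is the best score inside it when the path from
        # leaf x up to the subtree root is reserved and not yet scored.
        if isinstance(node, str):
            return 0.0, {node: 0.0}
        left, right = node
        closed_l, open_l = solve(left)
        closed_r, open_r = solve(right)
        # Alternative 1: no path passes through this vertex.
        best = closed_l + closed_r
        # Alternative 2: one path joins a left leaf x with a right leaf y.
        for (x, sx), (y, sy) in product(open_l.items(), open_r.items()):
            best = max(best, sx + sy + weight[frozenset((x, y))])
        # An open path climbing from one side leaves the sibling subtree free.
        open_ = {x: sx + closed_r for x, sx in open_l.items()}
        open_.update({y: sy + closed_l for y, sy in open_r.items()})
        return best, open_

    return solve(tree)[0]


# Hypothetical example: cherries (a, b) and (c, d) joined at the root.
tree = (("a", "b"), ("c", "d"))
weight = {
    frozenset("ab"): 1, frozenset("cd"): 1,
    frozenset("ac"): 5, frozenset("bd"): 5,
    frozenset("ad"): 1, frozenset("bc"): 1,
}
print(max_pairing(tree, weight))
```

Each pair of leaves is examined once, at its lowest common ancestor, so the sketch runs in time quadratic in the number of leaves. It returns only the score; recovering the path-system itself would require backtracking through the choices.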

Conclusions: The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.



Figure 3: A binary tree for which only one possible path-system exists that fulfills all constraints. Leaves that must appear in the output are highlighted with an arrow, and the (only) valid path-system is displayed in color. Note that the score of the subtree T[k] is -∞, because no path-system in T[k] exists that includes all three leaves x ∈ T[k]. The score of T[h], however, is greater than 0.

Mentions: For binary trees, this variant can be implemented by conditioning the matrices R and S to a subset of all possible paths and leaves. This is achieved by setting the score to -∞ for a particular interior vertex if one of the preconditions cannot be met in eqs. (2) and (3). For example, if two leaves x, y ∈ Z have the same parent u, an optimal path-system of both T[u] and T must contain the path π_xy, because otherwise either x or y would not belong to the optimal path-system due to the requirement of independence. Similarly, even if a particular path π_xy in the second alternative achieves the highest score in eq. (3), π_xy must not be selected if this conflicts with the possibility to select other prescribed leaves z ∈ Z (Fig. 3).
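The effect of the -∞ scores can be illustrated with a small sketch. It is not the authors' eqs. (2) and (3), which are not reproduced here; it is a simplified binary-tree dynamic program in which `closed` and `open_` play roles assumed to be analogous to S and R. The nested-pair tree encoding, the `weight` table, and the `required` argument (standing in for Z) are all hypothetical names.

```python
import math
from itertools import product

NEG_INF = -math.inf


def max_pairing_required(tree, weight, required):
    """Best score of a path-system that covers every leaf in `required`.

    Returns -inf when no admissible path-system exists.
    tree:   a leaf name (str) or a pair (left, right) of subtrees.
    weight: dict mapping frozenset({x, y}) to the weight of the path x-y.
    """
    def solve(node):
        if isinstance(node, str):
            # A prescribed leaf cannot be covered inside its own subtree,
            # so its closed score is -inf; it must be paired further up.
            closed = NEG_INF if node in required else 0.0
            return closed, {node: 0.0}
        left, right = node
        closed_l, open_l = solve(left)
        closed_r, open_r = solve(right)
        # Alternative 1: no path through this vertex (-inf propagates).
        best = closed_l + closed_r
        # Alternative 2: one path joins a left leaf x with a right leaf y.
        for (x, sx), (y, sy) in product(open_l.items(), open_r.items()):
            best = max(best, sx + sy + weight[frozenset((x, y))])
        open_ = {x: sx + closed_r for x, sx in open_l.items()}
        open_.update({y: sy + closed_l for y, sy in open_r.items()})
        return best, open_

    return solve(tree)[0]


# Hypothetical example: a and b are siblings and both prescribed, so the
# path a-b is forced even though a-c and b-d carry larger weights.
tree = (("a", "b"), ("c", "d"))
weight = {
    frozenset("ab"): 1, frozenset("cd"): 1,
    frozenset("ac"): 5, frozenset("bd"): 5,
    frozenset("ad"): 1, frozenset("bc"): 1,
}
print(max_pairing_required(tree, weight, {"a", "b"}))

# Three prescribed leaves that cannot all be covered give -inf.
small = ("a", ("b", "c"))
print(max_pairing_required(small, weight, {"a", "b", "c"}))
```

Because -∞ absorbs any finite addend and loses every `max` comparison, infeasible partial solutions drop out without explicit case analysis.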

