Limits...
Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees.

Arnold C, Stadler PF - Algorithms Mol Biol (2010)

Bottom Line: We describe a relatively simple dynamic programming algorithm for the special case of binary trees.We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n4 log n) w.r.t. the number n of leaves.This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.

View Article: PubMed Central - HTML - PubMed

Affiliation: Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.de.

ABSTRACT

Background: The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that are of considerable interest in bioinformatics: Given an arbitrary phylogenetic tree T and weights omegaxy for the paths between any two pairs of leaves (x, y), what is the collection of edge-disjoint paths between pairs of leaves that maximizes the total weight? Special cases of the MPP for binary trees and equal weights have been described previously; algorithms to solve the general MPP are still missing, however.

Results: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP.

Conclusions: The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.

No MeSH data available.


Related in: MedlinePlus

Three different path-systems on a tree with 15 leaves. Each path is shown in a distinctive color, and unused edges of the tree are shown as thin black lines. Clearly, no two paths share an edge, i.e., the corresponding collection of pairs of leaves is phylogenetically independent. Note that the paths are not necessarily vertex-disjoint.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2902485&req=5

Figure 1: Three different path-systems on a tree with 15 leaves. Each path is shown in a distinctive color, and unused edges of the tree are shown as thin black lines. Clearly, no two paths share an edge, i.e., the corresponding collection of pairs of leaves is phylogenetically independent. Note that the paths are not necessarily vertex-disjoint.

Mentions: Note that two paths in ϒ have at most one vertex in common (otherwise they would also share the sub-path, and therefore edges, between two common vertices). In binary trees, two edge-disjoint paths are also vertex-disjoint, since two edge-disjoint paths can only run through an interior vertex u with /chd(u)/ ≥ 3 (see Fig. 1). Two edge-disjoint paths can share a vertex u in two distinct situations: (1) if both paths have u as the last common ancestor of their respective leaves, u must have at least four children, (2) if u is the last common ancestor for one path, while the other path also includes an ancestor of u, three children of u are sufficient. These two situations will also lead to distinct cases in the algorithms that are presented next.


Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees.

Arnold C, Stadler PF - Algorithms Mol Biol (2010)

Three different path-systems on a tree with 15 leaves. Each path is shown in a distinctive color, and unused edges of the tree are shown as thin black lines. Clearly, no two paths share an edge, i.e., the corresponding collection of pairs of leaves is phylogenetically independent. Note that the paths are not necessarily vertex-disjoint.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2902485&req=5

Figure 1: Three different path-systems on a tree with 15 leaves. Each path is shown in a distinctive color, and unused edges of the tree are shown as thin black lines. Clearly, no two paths share an edge, i.e., the corresponding collection of pairs of leaves is phylogenetically independent. Note that the paths are not necessarily vertex-disjoint.
Mentions: Note that two paths in ϒ have at most one vertex in common (otherwise they would also share the sub-path, and therefore edges, between two common vertices). In binary trees, two edge-disjoint paths are also vertex-disjoint, since two edge-disjoint paths can only run through an interior vertex u with /chd(u)/ ≥ 3 (see Fig. 1). Two edge-disjoint paths can share a vertex u in two distinct situations: (1) if both paths have u as the last common ancestor of their respective leaves, u must have at least four children, (2) if u is the last common ancestor for one path, while the other path also includes an ancestor of u, three children of u are sufficient. These two situations will also lead to distinct cases in the algorithms that are presented next.

Bottom Line: We describe a relatively simple dynamic programming algorithm for the special case of binary trees.We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n4 log n) w.r.t. the number n of leaves.This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.

View Article: PubMed Central - HTML - PubMed

Affiliation: Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany. studla@bioinf.uni-leipzig.de.

ABSTRACT

Background: The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that are of considerable interest in bioinformatics: Given an arbitrary phylogenetic tree T and weights omegaxy for the paths between any two pairs of leaves (x, y), what is the collection of edge-disjoint paths between pairs of leaves that maximizes the total weight? Special cases of the MPP for binary trees and equal weights have been described previously; algorithms to solve the general MPP are still missing, however.

Results: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP.

Conclusions: The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.

No MeSH data available.


Related in: MedlinePlus