Phylogenetic network analysis as a parsimony optimization problem.

Wheeler WC - BMC Bioinformatics (2015)

Bottom Line: This is in contrast to softwired, where each character follows the lowest parsimony cost tree displayed by the network, resulting in costs which are less than or equal to the best display tree. In each case, the favored graph representation (tree or network) matched expectation or simulation scenario. The softwired network cost regime proposed here presents a quantitative criterion for an optimality-based search procedure where trees and networks can participate in hypothesis testing simultaneously.


Affiliation: Division of Invertebrate Zoology, American Museum of Natural History, Central Park West @ 79th Street, New York, 10024-5192, NY, USA. wheeler@amnh.org.

ABSTRACT

Background: Many problems in comparative biology are, or are thought to be, best expressed as phylogenetic "networks" as opposed to trees. In trees, vertices may have only a single parent (ancestor), while networks allow for multiple parent vertices. There are two main interpretive types of networks, "softwired" and "hardwired." The parsimony cost of a hardwired network is based on all changes over all edges, and hence must be greater than or equal to the cost of the best tree contained ("displayed") by the network. This is in contrast to softwired networks, where each character follows the lowest parsimony cost tree displayed by the network, resulting in costs which are less than or equal to that of the best display tree. Neither situation is ideal, since hardwired networks are not generally biologically attractive (individual heritable characters can have more than one parent) and softwired networks can be trivially optimized (by containing the best tree for each character). Furthermore, given the alternate cost scenarios of trees and these two flavors of networks, hypothesis testing among these explanatory scenarios is impossible.

Results: A network cost adjustment (penalty) is proposed to allow phylogenetic trees and softwired phylogenetic networks to compete equally on a parsimony optimality basis. This cost is demonstrated for several real and simulated datasets. In each case, the favored graph representation (tree or network) matched expectation or simulation scenario.

Conclusions: The softwired network cost regime proposed here presents a quantitative criterion for an optimality-based search procedure where trees and networks can participate in hypothesis testing simultaneously.
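
The relationship between softwired cost and display tree cost described in the Background can be illustrated with a minimal Python sketch. It is not the PCG implementation and it does not include the proposed network penalty, which the abstract does not specify. The two display trees, the taxon names and the two binary characters are hypothetical, and the display trees are listed by hand rather than extracted from a network. Each character is scored with the standard Fitch algorithm for unordered characters.

```python
from typing import Dict, List, Set, Tuple, Union

# A rooted binary tree is either a leaf name or a (left, right) pair.
Tree = Union[str, Tuple["Tree", "Tree"]]
Character = Dict[str, int]  # leaf name -> observed state


def fitch(tree: Tree, states: Character) -> Tuple[Set[int], int]:
    """Fitch down-pass: return (preliminary state set, number of changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # Disjoint child sets imply one additional change at this vertex.
    return left_set | right_set, left_cost + right_cost + 1


def tree_cost(tree: Tree, characters: List[Character]) -> int:
    """Parsimony cost of a single tree summed over all characters."""
    return sum(fitch(tree, ch)[1] for ch in characters)


def softwired_cost(display_trees: List[Tree], characters: List[Character]) -> int:
    """Each character independently follows its cheapest display tree."""
    return sum(min(fitch(t, ch)[1] for t in display_trees) for ch in characters)


# Hypothetical display trees of a network in which taxon B has two parents.
t1: Tree = (("A", "B"), ("C", "D"))
t2: Tree = ("A", (("B", "C"), "D"))

# Two hypothetical binary characters that favor different display trees.
chars: List[Character] = [
    {"A": 0, "B": 0, "C": 1, "D": 1},
    {"A": 0, "B": 1, "C": 1, "D": 0},
]

best_display = min(tree_cost(t, chars) for t in (t1, t2))
soft = softwired_cost([t1, t2], chars)
print(best_display, soft)  # 3 2
```

In this toy case each display tree costs 3 steps on its own, while the softwired cost is 2 because each character is free to pick its own tree. This is the trivial advantage that a network penalty is meant to offset.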



Fig. 6: Avian influenza tree (top, based on concatenated data) and network (bottom). Network edges in red. Internal vertices are labelled “rN”. Data from [3].

Mentions: A combination of Wagner random addition sequences (100 replicates), TBR refinement, and tree recombination (fusing) [31, 32] was employed for each analysis. Partitioned analyses are shown in Figs. 3 and 4. Candidate network scenarios were created in two ways. For the microhylid data, loci were analyzed independently (Fig. 3) and edges added to the simultaneous tree solution to create the candidate network. These network edges were based on minimum hybridization networks derived using Dendroscope [33] (Fig. 5). Networks were diagnosed using a prototype network tool, PhylogeneticComponentGraph (PCG; https://github.com/wardwheeler/PhyloComGraph.git), reading fasta and extended newick [34] files using the commands read(~*.fas~) read (newick:~network.enewick~). Currently, networks can only be diagnosed from input, not searched. With the influenza data, the reassortment scenario of [3] was used for network diagnosis (Fig. 6). For the linguistic data, the base tree of [27] was used, augmented by a scenario of Yuman-Takic exchange (in loanwords suggested by Jane Hill recorded in Kenneth C. [35]) (one edge; Fig. 7). Other exchanges regarded as unlikely (e.g., Aztec–Shoshone, Western Mono–(Eudeve + Òpata)) were tested as well.

