Taxon ordering in phylogenetic trees: a workbench test.

Cerutti F, Bertolotti L, Goldberg TL, Giacobini M - BMC Bioinformatics (2011)

Bottom Line: Best results were obtained when taxa were reordered using geographic information. Improved representations of genetic and geographic relationships between samples were also obtained when merged matrices (genetic and geographic information in one matrix) were used. Our innovative method makes phylogenetic trees easier to interpret, adding meaning to the taxon order and helping to prevent misinterpretations.


Affiliation: Department of Animal Production, Epidemiology and Ecology, Faculty of Veterinary Medicine, University of Torino, Grugliasco, Italy.

ABSTRACT

Background: Phylogenetic trees are an important tool for representing evolutionary relationships among organisms. In a phylogram or chronogram, the ordering of taxa is not considered meaningful, since complete topological information is given by the branching order and length of the branches, which are represented in the root-to-node direction. We apply a novel method based on a (λ + μ)-Evolutionary Algorithm to give meaning to the order of taxa in a phylogeny. This method applies random swaps between two taxa connected to the same node, without changing the topology of the tree. The evaluation of a new tree is based on different distance matrices, representing non-phylogenetic information such as other types of genetic distance, geographic distance, or combinations of these. To test our method we use published trees of Vesicular stomatitis virus, West Nile virus and Rice yellow mottle virus.
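The swap-and-evaluate loop described in the Background can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nested-list tree encoding, the cost function (sum of distances between tips that end up adjacent in the drawing order), the reading of λ as the population size and μ as the number of offspring, and all function names are hypothetical choices made for this example.

```python
import random

def leaves(tree):
    """Return tip labels in drawing order; a tip is a string, a node is a list."""
    if isinstance(tree, str):
        return [tree]
    return [tip for child in tree for tip in leaves(child)]

def internal_nodes(tree, found=None):
    """Collect every internal node (list) of the tree."""
    if found is None:
        found = []
    if not isinstance(tree, str):
        found.append(tree)
        for child in tree:
            internal_nodes(child, found)
    return found

def copy_tree(tree):
    """Deep-copy the nested lists so a mutation never alters its parent."""
    return tree if isinstance(tree, str) else [copy_tree(c) for c in tree]

def clades(tree):
    """Set of tip sets, one per internal node; unchanged by child swaps."""
    return {frozenset(leaves(node)) for node in internal_nodes(tree)}

def mutate(tree, rng):
    """Swap two children of one randomly chosen node (topology is preserved)."""
    new = copy_tree(tree)
    node = rng.choice(internal_nodes(new))
    i, j = rng.sample(range(len(node)), 2)
    node[i], node[j] = node[j], node[i]
    return new

def cost(tree, dist):
    """Assumed objective: total distance between neighbouring tips."""
    order = leaves(tree)
    return sum(dist[a][b] for a, b in zip(order, order[1:]))

def evolve(tree, dist, lam=5, mu=10, generations=200, seed=0):
    """Plus-selection EA: survivors are the best lam of parents and offspring."""
    rng = random.Random(seed)
    population = [tree] + [mutate(tree, rng) for _ in range(lam - 1)]
    for _ in range(generations):
        offspring = [mutate(rng.choice(population), rng) for _ in range(mu)]
        population = sorted(population + offspring,
                            key=lambda t: cost(t, dist))[:lam]
    return population[0]

# Toy data (not from the paper): four tips placed on a line.
position = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}
dist = {a: {b: abs(position[a] - position[b]) for b in position}
        for a in position}
start = [["C", "D"], ["B", "A"]]
best = evolve(start, dist)
print(leaves(best), cost(best, dist))
```

Because the original tree is part of the initial population and survivors are drawn from parents and offspring together, the returned tree can never score worse than the input, and `clades` confirms that only the tip order has changed.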

Results: Best results were obtained when taxa were reordered using geographic information. Information supporting phylogeographic analysis was recovered in the optimized tree, as evidenced by clustering of geographically close samples. Improving the trees using a separate genetic distance matrix altered the ordering of taxa, but not topology, moving the longest branches to the extremities, as would be expected since they are the most divergent lineages. Improved representations of genetic and geographic relationships between samples were also obtained when merged matrices (genetic and geographic information in one matrix) were used.
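One way to build a merged matrix is sketched below. The abstract does not say how the genetic and geographic matrices were combined, so the rescaling of each matrix by its largest entry, the weighted sum, the `weight` parameter, and the function names are all assumptions made for illustration; the numbers are toy values, not data from the paper.

```python
def scale_to_unit(matrix):
    """Divide every entry by the largest entry so values fall in [0, 1]."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return [row[:] for row in matrix]
    return [[value / largest for value in row] for row in matrix]

def merge_matrices(genetic, geographic, weight=0.5):
    """Weighted sum of two rescaled distance matrices (assumed scheme)."""
    gen = scale_to_unit(genetic)
    geo = scale_to_unit(geographic)
    return [[weight * g + (1 - weight) * s for g, s in zip(row_g, row_s)]
            for row_g, row_s in zip(gen, geo)]

# Toy matrices for three samples, on very different scales.
genetic = [[0.0, 0.1, 0.4],
           [0.1, 0.0, 0.3],
           [0.4, 0.3, 0.0]]
geographic = [[0.0, 100.0, 500.0],
              [100.0, 0.0, 400.0],
              [500.0, 400.0, 0.0]]
merged = merge_matrices(genetic, geographic)
```

Rescaling first keeps the matrix with the larger units (here, the geographic one) from dominating the sum; other combination schemes are equally plausible.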

Conclusions: Our innovative method makes phylogenetic trees easier to interpret, adding meaning to the taxon order and helping to prevent misinterpretations.



Figure 3: Map of VSV samples. Map of the study area in the USA and Mexico where VSV samples were collected. Each site is assigned a color, which matches the color of the tree tips for the samples collected there.

Mentions: VSV is a negative-sense single-stranded RNA virus of the family Rhabdoviridae that causes vesicular stomatitis in horses, cattle, swine, and certain wildlife species [15]. Starting from the tree originally published in [12], we created a Euclidean distance matrix from the coordinates of the collection sites. The EA was then run 50 times, each run starting with the original tree (Figure 2a) plus λ-1 different initial trees generated from it (see Methods). The performance of the runs was comparable, and we analyzed the best trees (Figure 2b). The order of taxa follows a clear north-south progression, reflecting the geographic arrangement of the collection sites shown on the map in Figure 3. In other words, the algorithm was able to group taxa from the same state, which are geographically closer to each other. We also tested the algorithm using a separate genetic distance matrix and a combined genetic and geographic distance matrix. Using genetic distances only, the improvement in tree representation is less evident, as might be expected given that the original tree was constructed from genetic data. However, the EA returns a tree in which the most genetically divergent clades are moved to the extremities of the phylogeny, giving a "C"-like shape, as shown in Figure 2c. The effect of combining the two matrices is clearly visible: the tree has the same "C"-like shape as the tree optimized on genetic distances, and it also preserves the aggregation of taxa from the same states (locations) (Figure 2d).
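The Euclidean distance matrix mentioned above can be built along these lines. This is a sketch only: the site labels and coordinates are placeholder values rather than the VSV collection sites, the function name is hypothetical, and treating coordinates as points on a plane is taken from the text's use of "Euclidean" without knowing how the authors handled latitude and longitude.

```python
import math

def euclidean_matrix(coords):
    """Pairwise straight-line distances between named points, as nested dicts."""
    names = list(coords)
    return {a: {b: math.dist(coords[a], coords[b]) for b in names}
            for a in names}

# Placeholder sites with made-up planar coordinates (x, y).
sites = {
    "site_1": (0.0, 0.0),
    "site_2": (3.0, 4.0),
    "site_3": (3.0, 0.0),
}
geo_dist = euclidean_matrix(sites)
print(geo_dist["site_1"]["site_2"])
```

The nested-dict layout lets a distance be looked up directly by the tip labels of the tree, which is convenient when scoring a given taxon order.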

