Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees.

Keller A, Förster F, Müller T, Dandekar T, Schultz J, Wolf M - Biol. Direct (2010)

Bottom Line: An extensive evaluation of the benefits of secondary structure, however, is lacking. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence only data, whereas a doubled marker size only accounts for robustness.


Affiliation: Department of Bioinformatics, University of Würzburg, Am Hubland, 97074 Würzburg, Germany.

ABSTRACT

Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking.

Results: This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergence levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 of the ribosomal cistron as an exemplary marker region. The simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence-only data, whereas a doubled marker size only accounts for robustness.

Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the usage of ITS2 sequence together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods such as maximum likelihood, Bayesian inference or maximum parsimony may equally profit from secondary structure inclusion.

Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin.

Open peer review: Reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers' comments section.


Figure 3: Bootstrap support values for trees with variable branch lengths. Subfigures are explained in Figure 1. Sample sizes are 7,000, 11,000 and 15,000 for each of the ten, 14 and 18 taxa scenarios, respectively.

Mentions: The shapes of bootstrap, Quartet distance and Robinson-Foulds distance distributions were similar for equidistant and variable distance trees. However, the branches of the trees for each underlying data set (sequence, sequence-structure and doubled sequence) received higher bootstrap support values and fewer false splits with constant branch lengths compared to variable distances, though differences were minimal (Figs. 1, 2, 3 and 4). Only Quartet distances are shown, since they are congruent with the results of the Robinson-Foulds distance (Additional file 1). Additionally, we included a relative per-branch representation of accuracy, divided by the number of internal nodes, in Additional file 1.
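To make the accuracy measures mentioned above more concrete, the following is a minimal Python sketch of how a Robinson-Foulds distance and a count of false splits could be computed between a true and a reconstructed topology. It is an illustration only, not the authors' pipeline: the nested-tuple tree encoding, the function names (leaves, splits, robinson_foulds, false_splits) and the example trees are all assumptions introduced here. Trees are treated as unrooted, and only non-trivial bipartitions (at least two taxa on each side) are compared.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*[leaves(child) for child in tree])


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the non-trivial bipartitions (splits) of an unrooted tree.

    Each split is stored as the side that does NOT contain a fixed anchor
    taxon, so the same bipartition always has the same representation.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*[visit(child) for child in node])
        side = all_taxa - clade if anchor in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a: Tree, tree_b: Tree) -> int:
    """Number of splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same taxon set")
    return len(splits(tree_a) ^ splits(tree_b))


def false_splits(true_tree: Tree, estimated_tree: Tree) -> int:
    """Number of splits in the estimated tree that are absent from the true tree."""
    return len(splits(estimated_tree) - splits(true_tree))


if __name__ == "__main__":
    # Made-up six-taxon example; taxa A..F are placeholders.
    true_tree = ((("A", "B"), "C"), ("D", ("E", "F")))
    estimated = ((("A", "C"), "B"), ("D", ("E", "F")))
    rf = robinson_foulds(true_tree, estimated)
    wrong = false_splits(true_tree, estimated)
    # Per-branch view: share of the estimated tree's internal splits that are false.
    share = wrong / len(splits(estimated))
    print(rf, wrong, round(share, 3))
```

The per-branch share printed at the end is one plausible way to relate errors to the number of internal branches; the exact normalization used in Additional file 1 is not specified in this excerpt.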

