Including RNA secondary structures improves accuracy and robustness in reconstruction of phylogenetic trees.

Keller A, Förster F, Müller T, Dandekar T, Schultz J, Wolf M - Biol. Direct (2010)

Bottom Line: An extensive evaluation of the benefits of secondary structure has been lacking. The results clearly show that structural information largely improves both the accuracy and the robustness of Neighbor Joining trees compared with sequence-only data, whereas doubling the marker size improves only robustness.

Affiliation: Department of Bioinformatics, University of Würzburg, Am Hubland, 97074 Würzburg, Germany.

ABSTRACT

Background: In several studies, secondary structures of ribosomal genes have been used to improve the quality of phylogenetic reconstructions. An extensive evaluation of the benefits of secondary structure, however, is lacking.

Results: This is the first study to counter this deficiency. We inspected the accuracy and robustness of phylogenetics with individual secondary structures by simulation experiments for artificial tree topologies with up to 18 taxa and for divergence levels in the range of typical phylogenetic studies. We chose the internal transcribed spacer 2 (ITS2) of the ribosomal cistron as an exemplary marker region. The simulation integrated the coevolution process of sequences with secondary structures. Additionally, the phylogenetic power of marker size duplication was investigated and compared with sequence and sequence-structure reconstruction methods. The results clearly show that accuracy and robustness of Neighbor Joining trees are largely improved by structural information in contrast to sequence-only data, whereas a doubled marker size only accounts for robustness.
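
To make the contrast between sequence-only and sequence-structure data concrete, the following is a minimal sketch and not the authors' pipeline. It assumes, beyond what the abstract states, that each aligned nucleotide is combined with its structural state from dot-bracket notation (unpaired, opening or closing) into one symbol, and that a simple uncorrected p-distance is used. The function names and the toy data are hypothetical. Such a distance matrix is the kind of input a Neighbor Joining implementation would take.

```python
from itertools import combinations


def encode_sequence_structure(seq, dotbracket):
    """Pair each nucleotide with its structural state ('.', '(' or ')').

    Assumption: sequence and structure are already aligned and gap-free.
    """
    if len(seq) != len(dotbracket):
        raise ValueError("sequence and structure must have equal length")
    return [nuc + state for nuc, state in zip(seq.upper(), dotbracket)]


def p_distance(a, b):
    """Fraction of aligned positions at which two symbol lists differ."""
    if len(a) != len(b):
        raise ValueError("aligned inputs must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(encoded):
    """Symmetric pairwise p-distances, keyed by (name, name) tuples."""
    names = sorted(encoded)
    dist = {(n, n): 0.0 for n in names}
    for m, n in combinations(names, 2):
        dist[(m, n)] = dist[(n, m)] = p_distance(encoded[m], encoded[n])
    return dist


# Hypothetical toy data: (aligned sequence, dot-bracket structure).
toy = {
    "taxon_a": ("GGCAUGCC", "(((..)))"),
    "taxon_b": ("GGCAUGCC", "((....))"),
    "taxon_c": ("GACAUGUC", "(((..)))"),
}

# Sequence-only symbols versus combined sequence-structure symbols.
seq_only = distance_matrix({k: list(s) for k, (s, _) in toy.items()})
seq_struct = distance_matrix(
    {k: encode_sequence_structure(s, t) for k, (s, t) in toy.items()}
)

print(seq_only[("taxon_a", "taxon_b")])
print(seq_struct[("taxon_a", "taxon_b")])
```

In this toy example, taxon_a and taxon_b have identical sequences, so sequence-only data cannot separate them, while the combined symbols register their differing base pairing. This illustrates only where the extra signal comes from; it says nothing about the size of the effect reported in the study.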

Conclusions: Individual secondary structures of ribosomal RNA sequences provide a valuable gain of information content that is useful for phylogenetics. Thus, the use of ITS2 sequences together with secondary structure for taxonomic inferences is recommended. Other reconstruction methods, such as maximum likelihood, Bayesian inference or maximum parsimony, may equally profit from the inclusion of secondary structure.

Reviewers: This article was reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin.

Open peer review: Reviewed by Shamil Sunyaev, Andrea Tanzer (nominated by Frank Eisenhaber) and Eugene V. Koonin. For the full reviews, please go to the Reviewers' comments section.

Figure 5: Tree topology of the plants example. Left side: topology and bootstrap values of sequence-only data. Right side: corresponding tree with inclusion of secondary structure. Families of the species are given at the right end. GenBank identifiers are given in parentheses after the species names.

Mentions: The trees reconstructed from sequence data and from sequence-structure data for the plant example were very different. Sequence-only information resulted in a correct topology reconstruction of genera (Fig. 5). However, the family of the Malvaceae could not be resolved. This supports the notion that the optimum divergence level of ITS2 sequences is at the species/genus level (see also Additional file 2). By contrast, all genera and families could be resolved with secondary structures. This results in a flawless tree topology and highlights the improved accuracy. Furthermore, the robustness of the tree was enhanced and the optimal divergence level was widened.

