Quest for Orthologs Entails Quest for Tree of Life: In Search of the Gene Stream.

Boeckmann B, Marcet-Houben M, Rees JA, Forslund K, Huerta-Cepas J, Muffato M, Yilmaz P, Xenarios I, Bork P, Lewis SE, Gabaldón T, Quest for Orthologs Species Tree Working Gro - Genome Biol Evol (2015)

Bottom Line: Topological differences are observed not only at deep speciation events, but also within younger clades, such as Hominidae, Rodentia, Laurasiatheria, or rosids. The evolutionary relationships of 27 archaea and bacteria are highly inconsistent. The largest concordant species tree includes at most 77 of the QfO reference organisms.


Affiliation: Swiss-Prot, Swiss Institute of Bioinformatics, Geneva, Switzerland brigitte.boeckmann@isb-sib.ch.

Fig. 4. Box plot of gene tree fractions supporting species tree topologies at different consistency levels. Consistent species tree topologies with (L90) and without (L70) significant branch support generally agree with the analyzed gene trees. The fraction of supporting gene trees drops considerably when species tree topologies are incongruent, at one or more nodes, between the species trees (L10, L30, L50). Consistency categories "L30" and "AT" were assigned for practical reasons. Level L30 is the default value for conflicting nodes prior to evaluation; the two nodes remaining at this level (Excavata, Proteobacteria) show conflicting species topologies on the one hand and significant branch support in at least one of the species trees on the other. Only a low fraction of our gene trees supports these speciation nodes. Category AT indicates alternative topologies suggested by the species trees, and its results cover the range of the conflicting levels (L10, L50); this makes sense because alternative topologies are incongruent with the consensus tree and between species trees. For each box plot, the bottom of the box is the first quartile (Q1), the top of the box is the third quartile (Q3), the middle bar is the median, and the whiskers extend to 1.5 times the interquartile range (IQR).
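The box plot elements named in the caption (Q1, median, Q3, whiskers at 1.5 times the IQR) can be computed as in the minimal sketch below. It is an illustration only, not the authors' plotting code: the function name `box_plot_stats`, the inclusive quartile method, and the Tukey convention of drawing each whisker to the most extreme value inside the 1.5 × IQR fence are all assumptions, and the input values are made up.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Summarize support fractions the way a Tukey-style box plot draws them.

    Assumptions (not stated in the source): quartiles use the 'inclusive'
    method, and whiskers end at the most extreme data point that still lies
    within 1.5 * IQR of the box.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Hypothetical support fractions for the nodes of one consistency level.
    example = [0.5, 0.25, 0.75, 1.0, 0.0]
    print(box_plot_stats(example))
```

A taller box for one category, as reported for AT, corresponds to a larger `q3 - q1` in this summary.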
© Copyright Policy - creative-commons


Mentions: To assess the level of discordance between the different species trees and the individual gene trees, we compared each species tree node with 458,108 gene phylogenies built for proteins encoded in 65 QfO reference species (Huerta-Cepas, Capella-Gutierrez, et al. 2014). This analysis was performed on ToLc-147 as well as on each species tree. By analyzing results according to the assigned consistency levels, we show that gene tree topologies coincide more often with consistent nodes (consistency levels L90 and L70) in species trees than with conflicting ones (consistency levels L10, L30, and L50) (fig. 4). This trend is also observed in the individual box plots for each project (supplementary file S3, Supplementary Material online). Interestingly, species tree topologies that differ from the consensus tree (hereafter named "alternative species topologies" and assigned category "AT" for practical reasons) form the most dispersed group. The range of their gene tree support values corresponds to that determined for incongruent nodes. In fact, category AT includes many prokaryotic speciation nodes that are still polytomous in ToLc-147 because of incongruent topologies in the species trees, which explains the comparatively tall box plot. Even if ToLc-147 is assumed to represent the true tree, gene trees congruent with alternative species topologies can be correct, for instance, when they contain xenologs or pseudo-orthologs. ToLc-147 with annotated gene tree support is presented in supplementary file S4, Supplementary Material online.
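The node-by-node comparison described above can be pictured with a small sketch. It is not the authors' pipeline, whose details are not given here. It assumes rooted, single-copy gene trees written as nested tuples of species names, and it counts a gene tree as supporting a species tree node when the node's species (restricted to those present in the gene tree) form a clade. The names `clades` and `support_fraction` and the example species are hypothetical.

```python
def clades(tree):
    """Return (leaf set of the tree, leaf sets of all internal nodes)."""
    if isinstance(tree, str):  # a leaf is just a species name
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def support_fraction(node_species, gene_trees):
    """Fraction of informative gene trees in which the node is a clade.

    A gene tree is treated as informative (an assumption of this sketch) only
    if it contains at least two species from the node and at least one
    species outside it. Returns None when no gene tree is informative.
    """
    node = frozenset(node_species)
    informative = 0
    supporting = 0
    for tree in gene_trees:
        leaves, tree_clades = clades(tree)
        target = node & leaves
        if len(target) < 2 or target == leaves:
            continue  # this tree cannot confirm or contradict the node
        informative += 1
        if target in tree_clades:
            supporting += 1
    return supporting / informative if informative else None


if __name__ == "__main__":
    # Hypothetical gene trees over a handful of species.
    gene_trees = [
        ((("human", "chimp"), "gorilla"), "mouse"),  # supports the node
        (("human", "mouse"), ("chimp", "gorilla")),  # conflicts with the node
        ("human", "chimp"),                          # no outgroup: skipped
        (("human", "gorilla"), "rat"),               # supports after restriction
    ]
    print(support_fraction({"human", "chimp", "gorilla"}, gene_trees))
```

Under these assumptions, a gene tree shaped by horizontal transfer or hidden paralogy would count as conflicting even though it may be correct, which matches the caveat about xenologs and pseudo-orthologs.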

