An improved probability mapping approach to assess genome mosaicism.

Zhaxybayeva O, Gogarten JP - BMC Genomics (2003)

Bottom Line: The mapping of bootstrap support values from these extended datasets gives results similar to the original maximum likelihood and posterior probability mapping. Better taxon sampling combined with subtree analyses prevents the inconsistencies associated with four-taxon analyses, but retains the power of visual representation. Nevertheless, a case-by-case inspection of individual multi-taxon phylogenies remains necessary to differentiate unrecognized paralogy and shared phylogenetic reconstruction artifacts from horizontal gene transfer events.

Affiliation: Department of Molecular and Cell Biology, University of Connecticut, 91 North Eagleville Road, Storrs, CT 06269-3125, USA. olga@carrot.mcb.uconn.edu

ABSTRACT

Background: Maximum likelihood and posterior probability mapping are useful visualization techniques for ascertaining the mosaic nature of prokaryotic genomes. However, posterior probabilities, especially when calculated for four-taxon cases, tend to overestimate the support for tree topologies. Furthermore, because of poor taxon sampling, four-taxon analyses are sensitive to the long branch attraction artifact. Here we extend the probability mapping approach by improving taxon sampling of the analyzed datasets, and by using bootstrap support values, a more conservative tool to assess reliability.

Results: Quartets of orthologous proteins were complemented with homologs from selected reference genomes. The mapping of bootstrap support values from these extended datasets gives results similar to the original maximum likelihood and posterior probability mapping. The more conservative nature of the plotted support values allows further analyses to focus on those protein families that strongly disagree with the majority or plurality of genes present in the analyzed genomes.

Conclusion: Posterior probability is a non-conservative measure for support, and posterior probability mapping only provides a quick estimation of phylogenetic information content of four genomes. This approach can be utilized as a pre-screen to select genes that might have been horizontally transferred. Better taxon sampling combined with subtree analyses prevents the inconsistencies associated with four-taxon analyses, but retains the power of visual representation. Nevertheless, a case-by-case inspection of individual multi-taxon phylogenies remains necessary to differentiate unrecognized paralogy and shared phylogenetic reconstruction artifacts from horizontal gene transfer events.

Figure 4: Genome quartet of Synechocystis sp., Chlorobium tepidum, Rhodobacter capsulatus and Rhodopseudomonas palustris. A) Posterior probability map calculated using probability mapping as described in [4,17]. B) Bootstrap support map (see [4] for methodology of bootstrap support map reconstruction). Only the four putatively orthologous sequences were utilized in the analyses. C) Bootstrap support map from extended datasets. For details on the figure notations see legends for figures 1 and 3. The majority of QuartOPs support one tree topology grouping two alpha proteobacteria together. The QuartOPs located in the two other corners of the triangle are candidates for horizontal gene transfer.

Mentions: To assess the utility of mapping probability and bootstrap support values for detecting more recent interphylum gene transfer events, we calculated extended datasets for the genome quartet of Synechocystis sp., Chlorobium tepidum, Rhodobacter capsulatus and Rhodopseudomonas palustris (see figure 4 and [3]). This genome quartet has a strong phylogenetic signal grouping together the two alpha proteobacteria R. capsulatus and R. palustris. This example had previously been utilized to demonstrate the validity of ML mapping [3], showing that the vast majority of QuartOPs group the proteins from the two more closely related organisms together. However, there are 14 QuartOPs that support the two alternative topologies with 99% posterior probability. These have to be regarded as candidates for horizontal gene transfer. Analysis of this genome quartet using extended datasets shows that some of these 14 QuartOPs are also supported by high bootstrap support values (above 90%). Figures 5 through 8 provide further analysis of the extended datasets for these QuartOPs. The cases of the cation-transporting ATPases (figure 5) and the hypothetical proteins depicted in figure 6 probably represent unrecognized paralogies. The proteins from R. palustris and R. capsulatus each group together with homologs from other alpha proteobacteria, and in some instances a single genome encodes both paralogs (Bradyrhizobium japonicum in the case of the hypothetical protein family and Sinorhizobium meliloti in the case of the cation-transporting ATPases). It appears likely that R. palustris has lost one paralog and R. capsulatus the other. In these two instances the unexpected behavior of the QuartOPs is due to failure of the strategy to select orthologous genes. In contrast, the cases of the water channel protein family and the methionyl-tRNA synthetases are best explained by horizontal gene transfer. None of the reference genomes contains paralogs whose differential loss might explain the observed phylogenies (figures 7 and 8).
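The pre-screen described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each QuartOP is summarized by three support values (one per four-taxon topology), places it in a triangle using barycentric coordinates, and flags families whose best-supported topology differs from the majority topology with support above 90%. The corner layout, the function names `to_triangle` and `flag_candidates`, the `majority_index` parameter, and the example values are all hypothetical.

```python
import math

# Assumed layout: one corner of an equilateral triangle per four-taxon topology.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def to_triangle(supports):
    """Map three topology support values to (x, y) via barycentric coordinates."""
    total = sum(supports)
    if total <= 0:
        raise ValueError("support values must sum to a positive number")
    weights = [s / total for s in supports]
    x = sum(w * cx for w, (cx, _) in zip(weights, CORNERS))
    y = sum(w * cy for w, (_, cy) in zip(weights, CORNERS))
    return x, y


def flag_candidates(quartops, majority_index=0, threshold=90.0):
    """Return QuartOPs that support a non-majority topology above the threshold.

    Flagged families are only candidates: they still need case-by-case
    inspection to separate paralogy and artifacts from gene transfer.
    """
    flagged = []
    for name, supports in quartops.items():
        best = max(range(3), key=lambda i: supports[i])
        if best != majority_index and supports[best] > threshold:
            flagged.append(name)
    return flagged


# Hypothetical bootstrap support values (percent) for three protein families.
quartops = {
    "family_A": (98.0, 1.0, 1.0),   # agrees with the majority topology
    "family_B": (2.0, 95.0, 3.0),   # strongly supports an alternative topology
    "family_C": (40.0, 35.0, 25.0), # unresolved, plots near the triangle center
}

for name, supports in quartops.items():
    print(name, to_triangle(supports))
print("candidates:", flag_candidates(quartops))
```

A family flagged this way would then be examined in a multi-taxon phylogeny, as the text does for figures 5 through 8.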

