The diversity of cyanobacterial metabolism: genome analysis of multiple phototrophic microorganisms.

Beck C, Knoop H, Axmann IM, Steuer R - BMC Genomics (2012)

Bottom Line: We describe genetic diversity found within cyanobacterial genomes, specifically with respect to metabolic functionality. Our results have direct implications for resource allocation and further sequencing projects. It can be extrapolated that the number of newly identified genes still increases significantly with an increasing number of newly sequenced genomes.


Affiliation: Institute for Theoretical Biology, Humboldt-University of Berlin, Invalidenstr. 43, D-10115 Berlin, Germany.

ABSTRACT

Background: Cyanobacteria are among the most abundant organisms on Earth and represent one of the oldest and most widespread clades known in modern phylogenetics. As the only known prokaryotes capable of oxygenic photosynthesis, cyanobacteria are considered to be a promising resource for renewable fuels and natural products. Our efforts to harness the sun's energy using cyanobacteria would greatly benefit from an increased understanding of the genomic diversity across multiple cyanobacterial strains. In this respect, the advent of novel sequencing techniques and the availability of several cyanobacterial genomes offer new opportunities for understanding microbial diversity, metabolic organization, and evolution in diverse environments.

Results: Here, we report a whole genome comparison of multiple phototrophic cyanobacteria. We describe genetic diversity found within cyanobacterial genomes, specifically with respect to metabolic functionality. Our results are based on pair-wise comparison of protein sequences and concomitant construction of clusters of likely ortholog genes. We differentiate between core, shared and unique genes and show that the majority of genes are associated with a single genome. In contrast, genes with metabolic function are strongly overrepresented within the core genome that is common to all considered strains. The analysis of metabolic diversity within core carbon metabolism reveals parts of the metabolic networks that are highly conserved, as well as highly fragmented pathways.

Conclusions: Our results have direct implications for resource allocation and further sequencing projects. It can be extrapolated that the number of newly identified genes still increases significantly with an increasing number of newly sequenced genomes. Furthermore, genome analysis of multiple phototrophic strains allows us to obtain a detailed picture of metabolic diversity that can serve as a starting point for biotechnological applications and automated metabolic reconstructions.



Figure 3: The cyanobacterial pan- and core-genome. Estimated size of the core- (A) and pan- (B) genome with increasing number of considered genomes. To avoid dependency on strain order, the 16 cyanobacterial strains were arranged in random order. At each step, we recalculated the number of core CLOGs (clusters of likely ortholog genes assigned to all strains included so far) and pan CLOGs (all CLOGs found so far in at least one of the included strains). This procedure was repeated 1000 times; the median across all iterations is shown. The error bars represent the 0.1 and 0.9 quantiles estimated from the 1000 iterations.

Figure 3 shows the size of the cyanobacterial core- and pan-genome estimated from the 16 strains considered here. The total pan-genome of all 16 strains encompasses more than 2·10^4 ortholog clusters, and its increase as a function of the number of genomes does not show substantial flattening of the curve (Figure 3B). Each newly included genome still adds more than approximately 500 novel ortholog clusters to the pan-genome. Given these rarefaction curves, it must be expected that sequencing further cyanobacterial strains will still result in the discovery of a high number of as yet unknown genes, even when the number of sequenced genomes goes significantly beyond the number sequenced so far. The results shown in Figures 2 and 3 give rise to two questions. First, what is the size of the total cyanobacterial pan-genome? Second, what is the functional and evolutionary difference, if any, between the core, shared and unique genes? Both questions have been addressed in the recent literature but cannot yet be resolved with any certainty.
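
The permutation procedure described in the Figure 3 caption can be illustrated with a short Python sketch. This is not the authors' code: the function name `pan_core_curves`, the input format (a dictionary mapping each strain name to the set of CLOG identifiers found in it), and the use of the "inclusive" quantile method are assumptions made for illustration only.

```python
import random
import statistics


def pan_core_curves(strain_clogs, n_iter=1000, seed=0):
    """Estimate core- and pan-genome sizes versus number of genomes.

    strain_clogs: dict mapping strain name -> set of CLOG ids
                  (hypothetical input format, not from the paper).
    Returns two lists (core, pan); entry k-1 holds the tuple
    (median, 0.1 quantile, 0.9 quantile) after including k genomes.
    """
    rng = random.Random(seed)
    strains = list(strain_clogs)
    n = len(strains)
    core_sizes = [[] for _ in range(n)]
    pan_sizes = [[] for _ in range(n)]

    for _ in range(n_iter):
        # Random strain order removes the dependency on input order.
        order = strains[:]
        rng.shuffle(order)
        core = set(strain_clogs[order[0]])
        pan = set()
        for k, strain in enumerate(order):
            core &= strain_clogs[strain]  # CLOGs present in all strains so far
            pan |= strain_clogs[strain]   # CLOGs present in at least one strain
            core_sizes[k].append(len(core))
            pan_sizes[k].append(len(pan))

    def summarize(samples):
        # Deciles: the first cut point is the 0.1 quantile, the last the 0.9.
        deciles = statistics.quantiles(samples, n=10, method="inclusive")
        return statistics.median(samples), deciles[0], deciles[-1]

    core = [summarize(s) for s in core_sizes]
    pan = [summarize(s) for s in pan_sizes]
    return core, pan


if __name__ == "__main__":
    # Toy data with three made-up strains, purely for demonstration.
    toy = {"A": {1, 2, 3}, "B": {1, 2, 4}, "C": {1, 5}}
    core, pan = pan_core_curves(toy, n_iter=200)
    for k, (c, p) in enumerate(zip(core, pan), start=1):
        print(k, "core:", c, "pan:", p)
```

In such a sketch, a pan-genome curve that keeps rising at the last step, as reported for the 16 strains, would show up as a clearly positive difference between the final two pan-genome medians.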

