Comparative supragenomic analyses among the pathogens Staphylococcus aureus, Streptococcus pneumoniae, and Haemophilus influenzae using a modification of the finite supragenome model.

Boissy R, Ahmed A, Janto B, Earl J, Hall BG, Hogg JS, Pusch GD, Hiller LN, Powell E, Hayes J, Yu S, Kathju S, Stoodley P, Post JC, Ehrlich GD, Hu FZ - BMC Genomics (2011)

Bottom Line: We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model, we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.


Affiliation: Center for Genomic Sciences, Allegheny-Singer Research Institute, Pittsburgh, PA 15212, USA.

ABSTRACT

Background: Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host, from carriage to sepsis, and is frequently associated with nosocomial and community-acquired infections; the differential gene content among strains is therefore of interest.

Results: We sequenced three clinical strains and combined these data with 13 publicly available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and then their gene similarities and differences were delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure, which provides for high-fidelity metabolomic subsystem analyses to extend our comparative genomic characterization of these strains.

Conclusions: Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model, we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.
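
The counts reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers given in the abstract; the variable names are illustrative and not taken from the authors' software.

```python
from math import comb

# Strain counts from the Results: 3 newly sequenced clinical strains,
# 13 public human isolates, and 1 bovine strain.
n_strains = 3 + 13 + 1

# Orthologous gene cluster classes reported in the Results.
core, distributed, unique = 2266, 755, 134
total_clusters = core + distributed + unique

# Number of distinct strain pairs available for gene-content comparison.
n_pairs = comb(n_strains, 2)

print(n_strains, total_clusters, n_pairs)  # 17 3155 136
```

The three cluster classes sum to the reported 3,155 clusters, and 17 strains give the reported 136 pairwise comparisons.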



Figure 4: Histogram of observed sample gene frequencies compared to the predicted number using the finite supragenome model. The number of genes for each frequency class was calculated using the results from our revised finite supragenome model (trained on all 17 strains). The observed and predicted number of core genes (2,266) found in all 17 strains agreed exactly; these values are not shown to avoid distortion of the scale of the graph. Distributed genes appear in two or more strains, but not all (from 2 to 16 here).
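
The observed side of such a histogram can be illustrated with a minimal sketch, assuming the clustering output is available as a mapping from cluster ID to the set of strains that carry it. The data structure, function names, and four-strain toy data are hypothetical and are not the authors' pipeline. Treating "unique" as "present in exactly one strain" is likewise an assumption, consistent with the caption's definition of distributed genes.

```python
from collections import Counter

def frequency_histogram(presence, n_strains):
    """Count clusters observed in exactly n strains, for n = 1..n_strains.

    presence: dict mapping cluster ID -> set of strain indices (hypothetical format).
    """
    hist = Counter(len(strains) for strains in presence.values())
    return [hist.get(n, 0) for n in range(1, n_strains + 1)]

def classify(n_present, n_strains):
    """Label a cluster by how many strains carry it."""
    if n_present == n_strains:
        return "core"
    if n_present == 1:
        return "unique"          # assumed meaning: found in a single strain
    return "distributed"         # two or more strains, but not all

# Toy data for 4 hypothetical strains (not from the paper).
toy = {
    "a": {0, 1, 2, 3},
    "b": {0, 1, 2, 3},
    "c": {0, 2},
    "d": {1},
    "e": {1, 2, 3},
}
hist = frequency_histogram(toy, n_strains=4)
classes = Counter(classify(len(s), 4) for s in toy.values())
print(hist)           # [1, 1, 1, 2]
print(dict(classes))  # {'core': 2, 'distributed': 2, 'unique': 1}
```

With 17 strains, the last histogram bin would hold the core count, which is the bar the caption says was left out of the plot to preserve the scale.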

Mentions: The finite supragenome model is predictive as well as descriptive. Figure 4 shows the excellent correlation between the observed sample gene frequency data from the 17 S. aureus genomes under study (the number of genes observed in exactly n = 1, 2, ..., 17 of these genomes) and the same values predicted using the values of μ, π, and N obtained from our revised finite supragenome model trained on the sample data (all 17 strains). Figure 5 (lower panels) shows the ability of the model to predict the numbers of new genes, core genes, and total chromosomal genes that should be detectable after sequencing up to 30 S. aureus genomes. These predictions agree very well with the values obtained from our analyses of the 17 S. aureus genomes under study (Figure 5, upper panels). They also indicate that sequencing 30 S. aureus genomes will yield 99.5% of the total number and 99.4% of the core chromosomal genes in this species' supragenome (N = 3,221 genes).
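
How predicted frequency counts might be derived from μ, π, and N can be sketched under stated assumptions. The excerpt does not define these symbols, so the sketch assumes that μ is a vector of population gene-frequency classes, π holds the corresponding mixture weights, and N is the supragenome size, so that the number of sampled genomes carrying a gene follows a mixture of binomials. This is an illustration, not the authors' revised model, and the parameter values for μ and π are hypothetical placeholders rather than fitted values.

```python
from math import comb

def expected_counts(N, mu, pi, K):
    """Expected number of genes present in exactly n of K genomes, n = 0..K.

    Assumes each gene belongs to frequency class k with probability pi[k]
    and appears in any one genome independently with probability mu[k].
    """
    counts = []
    for n in range(K + 1):
        p_n = sum(
            w * comb(K, n) * m**n * (1 - m) ** (K - n)
            for m, w in zip(mu, pi)
        )
        counts.append(N * p_n)
    return counts

# Hypothetical parameters, for illustration only.
mu = [0.1, 0.5, 1.0]   # assumed population frequency classes
pi = [0.2, 0.1, 0.7]   # assumed mixture weights (sum to 1)
N = 3221               # supragenome size reported in the text
K = 17                 # genomes in the sample

counts = expected_counts(N, mu, pi, K)
expected_core = counts[K]            # genes expected in all K genomes
expected_observed = N - counts[0]    # genes expected in at least one genome
print(round(expected_core), round(expected_observed))
```

Under these assumptions, entries 1 through K of `counts` are the predicted values that would be compared against an observed histogram, and entry 0 is the expected number of genes not yet seen in any sampled genome.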

