In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:H7, one of which is associated with hyper-virulence.

Laing CR, Buchanan C, Taboada EN, Zhang Y, Karmali MA, Thomas JE, Gannon VP - BMC Genomics (2009)

Bottom Line: Each lineage contains significant phenotypic differences, with lineage I strains being the most commonly associated with human infections. The results of this study highlight the similarities in relationships derived from multi-locus genome sampling methods and suggest a "common genotyping language" may be devised for population genetics and epidemiological studies. Future genotyping methods should provide data that can be stored centrally and accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.

Affiliation: Laboratory for Foodborne Zoonoses, Public Health Agency of Canada, Lethbridge, AB, Canada. chad_r_laing@phac-aspc.gc.ca

ABSTRACT

Background: Many approaches have been used to study the evolution, population structure and genetic diversity of Escherichia coli O157:H7; however, observations made with different genotyping systems are not easily relatable to each other. Three genetic lineages of E. coli O157:H7 designated I, II and I/II have been identified using octamer-based genome scanning and microarray comparative genomic hybridization (mCGH). Each lineage contains significant phenotypic differences, with lineage I strains being the most commonly associated with human infections. Similarly, a clade of hyper-virulent O157:H7 strains implicated in the 2006 spinach and lettuce outbreaks has been defined using single-nucleotide polymorphism (SNP) typing. In this study an in silico comparison of six different genotyping approaches was performed on 19 E. coli genome sequences from 17 O157:H7 strains and single O145:NM and K12 MG1655 strains to provide an overall picture of diversity of the E. coli O157:H7 population, and to compare genotyping methods for O157:H7 strains.

Results: In silico determination of lineage, Shiga-toxin bacteriophage integration site, comparative genomic fingerprint, mCGH profile, novel region distribution profile, SNP type and multi-locus variable number tandem repeat analysis type was performed and a supernetwork based on the combination of these methods was produced. This supernetwork showed three distinct clusters of strains that were O157:H7 lineage-specific, with the SNP-based hyper-virulent clade 8 synonymous with O157:H7 lineage I/II. Lineage I/II/clade 8 strains clustered closest on the supernetwork to E. coli K12 and E. coli O55:H7, O145:NM and sorbitol-fermenting O157 strains.

Conclusion: The results of this study highlight the similarities in relationships derived from multi-locus genome sampling methods and suggest a "common genotyping language" may be devised for population genetics and epidemiological studies. Future genotyping methods should provide data that can be stored centrally and accessed locally in an easily transferable, informative and extensible format based on comparative genomic analyses.

Figure 1: Novel regions distribution among the E. coli strains in Table 1. The distribution of 1456 regions ~500 bp in size among 17 E. coli O157:H7 strains and the O145:NM strain EC33264 and K12 strain MG1655. Regions are not necessarily contiguous and are defined as novel based on less than 80% sequence identity to the genome of either EDL933 or Sakai. Black indicates the presence of a region and white indicates the absence of a region.

Mentions: The microarray used in the study by Zhang et al. [7] was based on lineage I strains EDL933 and Sakai and the K12 strain MG1655, so it is not surprising that the lineage I strains were found to contain more of the lineage I markers than strains from the other lineages. To account for novel genomic regions present in other O157:H7 strains, but not found in EDL933 or Sakai, an in silico subtractive hybridization was performed on every O157:H7 sequence in Table 1 against EDL933 and Sakai (Laing et al., in preparation). The study found 417 separate regions comprising 1456 segments of approximately 500 bp in length (~0.8 Mbp of novel DNA sequence). The distribution of these segments in silico is shown in Figure 1 and highlights the fact that lineage I strains possess genomic regions not represented in the original microarray probe set, as well as the fact that there are other lineage I/II and lineage II specific genomic regions.
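The novelty criterion and the presence/absence display described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline, whose details are not given here. It assumes that a segment counts as novel when its best percent identity to both EDL933 and Sakai is below 80%, and that a segment is scored as present in another strain at 80% identity or higher; that second cutoff is an assumption, as the text does not state it. The segment names, the strain names strainA and strainB, and the identity values are hypothetical toy data.

```python
from typing import Dict, List

NOVELTY_THRESHOLD = 80.0   # percent identity cutoff quoted in the Figure 1 caption
REFERENCES = ("EDL933", "Sakai")


def is_novel(hits: Dict[str, float], threshold: float = NOVELTY_THRESHOLD) -> bool:
    """Treat a segment as novel if its identity to every reference is below threshold.

    Assumption: "either EDL933 or Sakai" is read as "neither reference matches".
    A missing entry is treated as 0% identity (no hit).
    """
    return all(hits.get(ref, 0.0) < threshold for ref in REFERENCES)


def presence_matrix(
    identities: Dict[str, Dict[str, float]],
    strains: List[str],
    presence_cutoff: float = 80.0,  # assumed cutoff, not stated in the source
) -> Dict[str, List[int]]:
    """Return one row per novel segment: 1 = present (black), 0 = absent (white)."""
    matrix: Dict[str, List[int]] = {}
    for segment, hits in identities.items():
        if not is_novel(hits):
            continue  # segment is already represented in EDL933 or Sakai
        matrix[segment] = [
            1 if hits.get(strain, 0.0) >= presence_cutoff else 0
            for strain in strains
        ]
    return matrix


def approx_total_bp(n_segments: int, segment_len: int = 500) -> int:
    """Rough amount of sequence covered by n segments of about segment_len bp."""
    return n_segments * segment_len


# Hypothetical toy input: best percent identity of each segment in each genome.
toy = {
    "seg1": {"EDL933": 35.0, "Sakai": 40.0, "strainA": 99.0, "strainB": 10.0},
    "seg2": {"EDL933": 98.0, "Sakai": 97.5, "strainA": 99.0, "strainB": 99.0},
    "seg3": {"EDL933": 79.9, "Sakai": 0.0, "strainA": 0.0, "strainB": 85.0},
}

matrix = presence_matrix(toy, ["strainA", "strainB"])
total_bp = approx_total_bp(1456)
```

With the toy data, seg2 is dropped because it matches both reference genomes, while seg1 and seg3 are kept as novel rows. Multiplying 1456 segments by a nominal 500 bp gives 728,000 bp, which is in line with the rounded figure of roughly 0.8 Mbp quoted above, given that segment lengths are only approximately 500 bp.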

