High resolution clustering of Salmonella enterica serovar Montevideo strains using a next-generation sequencing approach.

Allard MW, Luo Y, Strain E, Li C, Keys CE, Son I, Stones R, Musser SM, Brown EW - BMC Genomics (2012)

Bottom Line: In no case, however, did variability associated with sequencing methods or sample preparations create inconsistencies with our current phylogenetic results or the subsequent molecular epidemiological evidence gleaned from these data. Implementation of a validated pipeline for NGS data acquisition and analysis provides highly reproducible results that are stable and predictable for molecular epidemiological applications. This reproducibility applies to all levels within and between serovars of Salmonella, suggesting that investigators using these methods can have confidence in their conclusions.


Affiliation: Office of Regulatory Science, Center for Food Safety & Applied Nutrition, U.S. Food & Drug Administration, 5100 Paint Branch Parkway, College Park, MD 20740, USA. Marc.Allard@fda.hhs.gov

ABSTRACT

Background: Next-Generation Sequencing (NGS) is increasingly being used as a molecular epidemiologic tool for discerning ancestry and traceback of the most complicated, difficult-to-resolve bacterial pathogens. Making a linkage between possible food sources and clinical isolates requires distinguishing the suspected pathogen from an environmental background and placing the variation observed into the wider context of variation occurring within a serovar and among other closely related foodborne pathogens. Equally important is the need to validate these high-resolution molecular tools for use in molecular epidemiologic traceback. Such efforts include the examination of strain cluster stability as well as the cumulative genetic effects of sub-culturing on these clusters. Numerous isolates of S. Montevideo were shotgun sequenced, including diverse lineage representatives as well as numerous replicate clones, to determine how much variability is due to bias, sequencing error, and/or the culturing of isolates. All new draft genomes were compared to 34 S. Montevideo isolates previously published during an NGS-based molecular epidemiological case study.

Results: Intraserovar lineages of S. Montevideo differ by thousands of SNPs, only slightly fewer than the number of SNPs observed between S. Montevideo and other distinct serovars. Much less variability was discovered within an individual S. Montevideo clade implicated in a recent foodborne outbreak, as well as among individual NGS replicates. These findings were similar to previous reports documenting homopolymeric and deletion error rates with the Roche 454 GS Titanium technology. In no case, however, did variability associated with sequencing methods or sample preparations create inconsistencies with our current phylogenetic results or the subsequent molecular epidemiological evidence gleaned from these data.
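The contrast drawn above between lineages separated by thousands of SNPs and replicates separated by only a few can be illustrated with a pairwise SNP count over aligned sequences. The following is a minimal sketch, not the authors' pipeline: the function name `snp_distance` and the toy sequences are hypothetical, and it assumes the sequences are already aligned to equal length and that ambiguous or gapped positions are skipped.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are ignored. This is an assumption
    of the sketch, not a rule taken from the paper.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


# Hypothetical toy alignment (real genomes are millions of bases long).
outbreak_1 = "ACGTACGTAC"
outbreak_2 = "ACGTACGTAT"  # differs from outbreak_1 at one position
distant = "ACCTAGGTNT"     # differs at three unambiguous positions

print(snp_distance(outbreak_1, outbreak_2))
print(snp_distance(outbreak_1, distant))
```

On real data the same count would be taken over a whole-genome alignment or a SNP matrix, where close replicates give small values and distinct lineages give values in the thousands.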

Conclusions: Implementation of a validated pipeline for NGS data acquisition and analysis provides highly reproducible results that are stable and predictable for molecular epidemiological applications. Draft genomes collected at 15×-20× coverage and passed through a quality filter as part of a data analysis pipeline, including sub-passaged replicates defined by only a few SNPs, can be accurately placed in a phylogenetic context. This reproducibility applies to all levels within and between serovars of Salmonella, suggesting that investigators using these methods can have confidence in their conclusions.
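For readers unfamiliar with the 15×-20× figure, mean coverage is the total number of sequenced bases divided by the genome length. The sketch below works through that arithmetic. The genome size of about 4.65 Mb is taken from the genome sizes reported for this clone; the mean read length of 400 bp is an assumption chosen only for illustration, and the function names are hypothetical.

```python
import math


def mean_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Mean depth of coverage: sequenced bases per genome position."""
    return total_bases / genome_size_bp


def reads_needed(target_coverage: float, genome_size_bp: int,
                 mean_read_length_bp: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_length_bp)


GENOME_SIZE_BP = 4_650_000   # ~4.65 Mb, as reported for the outbreak clone
MEAN_READ_LENGTH_BP = 400    # assumption for illustration only

for target in (15, 20):
    n_reads = reads_needed(target, GENOME_SIZE_BP, MEAN_READ_LENGTH_BP)
    achieved = mean_coverage(n_reads * MEAN_READ_LENGTH_BP, GENOME_SIZE_BP)
    print(f"{target}x target: {n_reads} reads, {achieved:.1f}x achieved")
```

This is a mean only; actual depth varies along the genome, which is one reason a quality filter is applied before SNP calling.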



Figure 4: NGS discovery of unique SNPs and insertional genetic attributes found in a highly homogeneous strain of S. Montevideo from California (157_Clinical_CA). (A) Isolate names correspond to samples in Table 1, and gene names correspond to the ORFs containing informative SNPs among a single S. Montevideo outbreak clone in Table 3. A representative nucleotide site observed across 5 isolates is listed for each ORF. ORFs are mapped against a reference of S. Typhimurium strain LT2, with lines going to approximate chromosomal positions relative to the reference (numbers in Mbp). (B) A comparative MAUVE analysis of isolate 157_Clinical_CA revealed the presence of a 100 kb insertion with homology to Enterobacterial phage D6. Here we compared the isolate to another, more complete homologous relative, phage P1, to document the insertion site. The graphic is in standard MAUVE format, showing putative genes as boxes, with arrows documenting insertions and rearrangements. Forward and reverse strands are on opposite sides of the mid-line.

Mentions: Although the majority of isolates composing the spiced-meat S. Montevideo clone generally exhibited a common genome length, one isolate from California (S. Montevideo 157_Clinical_CA) retained a noticeably larger genome than other members of this lineage (Figure 2). In addition to being separated from the other S. Montevideo isolates associated with the spiced-meat contamination event by nine phylogenetically informative SNPs (Figure 4A), comparative analysis revealed the presence of a 100 kb insertion with substantial homology to Enterobacterial phage D6. Since phage D6 was incomplete in GenBank (No. AY753669), a MAUVE comparison to another homologous relative, phage P1 (No. NC_005856), was helpful in suggesting that this may represent a D6-like phage insertion into contig 104 in this particular S. Montevideo genome. Based on the known length of phage D6, this particular insertion in S. Montevideo strain 157 accounts for the observed variation between this genome (~4.75 Mb) and the other spiced-meat S. Montevideo genomes reported here (~4.65 Mb). Moreover, this finding underscores the utility of whole-genome scanning technologies for identifying the source of size polymorphisms between otherwise homogeneous strains of Salmonella.
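The size argument above can be checked with simple arithmetic: the difference between the two approximate genome sizes should be close to the reported insertion length. The sketch below uses only the rounded figures quoted in the text. The function names and the 10 kb tolerance are hypothetical choices made for illustration, not values from the paper.

```python
# Approximate figures quoted in the text.
GENOME_157_MB = 4.75     # S. Montevideo 157_Clinical_CA
GENOME_OTHERS_MB = 4.65  # other spiced-meat clone genomes
INSERTION_KB = 100.0     # reported D6-like phage insertion


def size_difference_kb(larger_mb: float, smaller_mb: float) -> float:
    """Genome size difference converted from megabases to kilobases."""
    return (larger_mb - smaller_mb) * 1000.0


def is_consistent(diff_kb: float, insertion_kb: float,
                  tolerance_kb: float = 10.0) -> bool:
    """True if the size difference matches the insertion within a tolerance.

    The tolerance is an arbitrary illustrative value; the quoted genome
    sizes are themselves rounded to the nearest 0.05 Mb or so.
    """
    return abs(diff_kb - insertion_kb) <= tolerance_kb


diff_kb = size_difference_kb(GENOME_157_MB, GENOME_OTHERS_MB)
print(round(diff_kb), is_consistent(diff_kb, INSERTION_KB))
```

With the rounded sizes given, the difference comes to about 100 kb, which matches the insertion length reported from the MAUVE comparison.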

