Microbiome profiling by Illumina sequencing of combinatorial sequence-tagged PCR products.

Gloor GB, Hummelen R, Macklaim JM, Dickson RJ, Fernandes AD, MacPhee R, Reid G - PLoS ONE (2010)

Bottom Line: The number of reads generated permitted saturating or near-saturating analysis of samples of the vaginal microbiome. We show that the short reads are sufficient to assign organisms to the genus or species level in most cases. We suggest that this method will be useful for the deep sequencing of any short nucleotide region that is taxonomically informative; these include the V3 and V5 regions of the bacterial 16S rRNA genes and the eukaryotic V9 region that is gaining popularity for sampling protist diversity.

Affiliation: Department of Biochemistry, University of Western Ontario, London, Ontario, Canada. ggloor@uwo.ca

ABSTRACT
We developed a low-cost, high-throughput microbiome profiling method that uses combinatorial sequence tags attached to PCR primers that amplify the rRNA V6 region. Amplified PCR products are sequenced using an Illumina paired-end protocol to generate millions of overlapping reads. Combinatorial sequence tagging can be used to examine hundreds of samples with far fewer primers than are required when sequence tags are incorporated at only a single end. The number of reads generated permitted saturating or near-saturating analysis of samples of the vaginal microbiome. The large number of reads allowed an in-depth analysis of errors, and we found that PCR-induced errors composed the vast majority of non-organism-derived species variants, an observation that has significant implications for sequence clustering of similar high-throughput data. We show that the short reads are sufficient to assign organisms to the genus or species level in most cases. We suggest that this method will be useful for the deep sequencing of any short nucleotide region that is taxonomically informative; these include the V3 and V5 regions of the bacterial 16S rRNA genes and the eukaryotic V9 region that is gaining popularity for sampling protist diversity.
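
The primer savings claimed for combinatorial tagging can be illustrated with a short sketch. It assumes that each sample is labelled by one left-end tag and one right-end tag, so that n left tags and m right tags cover n × m samples with n + m tagged primers. The tag sequences, sample names, and helper functions below are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical tag sets for illustration only; not the tags used in the paper.
LEFT_TAGS = ["ACGTA", "CTAGC", "GATCG", "TGCAT"]
RIGHT_TAGS = ["AGTCA", "CAGTC", "GTCAG"]


def tag_pairs(left_tags, right_tags):
    """Return every (left, right) tag combination; each pair labels one sample."""
    return list(product(left_tags, right_tags))


def primer_counts(n_left, n_right):
    """Compare primer requirements for the same number of samples.

    Combinatorial tagging needs n_left + n_right tagged primers.
    Tagging a single end needs one tagged primer per sample
    (n_left * n_right) plus one untagged primer for the other end.
    """
    samples = n_left * n_right
    return {
        "samples": samples,
        "combinatorial_primers": n_left + n_right,
        "single_end_primers": samples + 1,
    }


def assign_sample(read_left_tag, read_right_tag, pair_to_sample):
    """Look up the sample for a read's tag pair; None if the pair is unknown."""
    return pair_to_sample.get((read_left_tag, read_right_tag))


pairs = tag_pairs(LEFT_TAGS, RIGHT_TAGS)
# Hypothetical sample names, numbered in the order the pairs are generated.
pair_to_sample = {pair: f"sample_{i:02d}" for i, pair in enumerate(pairs, start=1)}

print(primer_counts(len(LEFT_TAGS), len(RIGHT_TAGS)))
print(assign_sample("CTAGC", "GTCAG", pair_to_sample))
```

The gap widens quickly: under the same assumption, 20 left tags and 20 right tags would label 400 samples with 40 tagged primers.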

Figure 9 (pone-0015406-g009): Plot of the reproducibility between and within samples. The black-filled circles plot within-sample variation, and the red circles plot the between-sample variation for the GTCGC tag. The count of sequences composing OTUs clustered at 95% identity for samples containing the GTCGC tag and the GTCG N-1 tag are in black. This shows the technical replication of the data when amplified from the same sample in the same tube. The open red circles plot the correspondence for between-sample OTU counts.

Mentions: We found that one right-end tag, GCGAG, was composed of a mixture with the ratio 69.5/30.5 of the full-length and the unique N-1 truncation-derived GCGA tag. This oligonucleotide synthesis error was exploited to determine the effect of the number of reads on within-sample variability; in essence, the N-1 truncated tag allowed an examination of the technical replication of the experiment. The GCGAC tag was used in 17 samples. The black-filled circles in Figure 9 show the number of reads from the full-length GCGAC tag compared to the truncated GCGA tag in these samples. The red open circles in Figure 9 show an example of the read replication observed from independent samples. The replication of the read numbers in the full-length and N-1 samples is extremely high for reads occurring at least 30 times in the full-length tag set and at least 10 times in the N-1 tag set. As expected, the read replication for independent samples is much poorer. The correlation coefficients for the 17 full-length and N-1 samples ranged from 0.97 to 0.99 when the N-1 sample contained at least 10 reads. Thus, we conclude that the number of reads in a sample is reproducible if at least 10 reads are observed. Similar conclusions about the minimum read abundance have been drawn from RNA-seq experiments [27].
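
The replication check described above can be sketched as a correlation between per-OTU read counts from the full-length tag and from its N-1 truncation, restricted to OTUs with at least 10 reads in the N-1 set. The text does not say which correlation coefficient was computed or whether counts were transformed first, so the use of Pearson's r on raw counts is an assumption here. The counts are synthetic, invented to mimic a roughly 69.5/30.5 split, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def replicate_correlation(full_counts, n1_counts, min_n1_reads=10):
    """Correlate per-OTU counts, keeping OTUs with >= min_n1_reads in the N-1 set.

    Returns (correlation, number of OTUs kept).
    """
    kept = [
        (full_counts[otu], n1_counts[otu])
        for otu in full_counts
        if n1_counts.get(otu, 0) >= min_n1_reads
    ]
    if len(kept) < 2:
        raise ValueError("fewer than 2 OTUs pass the read-count threshold")
    full, n1 = zip(*kept)
    return pearson(full, n1), len(kept)


# Synthetic counts, invented for illustration; not data from the paper.
full_length = {"otu_a": 6950, "otu_b": 700, "otu_c": 68, "otu_d": 9, "otu_e": 3}
n_minus_1 = {"otu_a": 3050, "otu_b": 300, "otu_c": 32, "otu_d": 1, "otu_e": 4}

r, n_kept = replicate_correlation(full_length, n_minus_1)
print(f"OTUs kept: {n_kept}, Pearson r = {r:.4f}")
```

The low-count OTUs are dropped before the correlation is computed, which mirrors the conclusion that read numbers are reproducible only once at least 10 reads are observed.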

