Microbiome profiling by Illumina sequencing of combinatorial sequence-tagged PCR products.

Gloor GB, Hummelen R, Macklaim JM, Dickson RJ, Fernandes AD, MacPhee R, Reid G - PLoS ONE (2010)

Bottom Line: The number of reads generated permitted saturating or near-saturating analysis of samples of the vaginal microbiome. We show that the short reads are sufficient to assign organisms to the genus or species level in most cases. We suggest that this method will be useful for the deep sequencing of any short nucleotide region that is taxonomically informative; these include the V3, V5 regions of the bacterial 16S rRNA genes and the eukaryotic V9 region that is gaining popularity for sampling protist diversity.


Affiliation: Department of Biochemistry, University of Western Ontario, London, Ontario, Canada. ggloor@uwo.ca

ABSTRACT
We developed a low-cost, high-throughput microbiome profiling method that uses combinatorial sequence tags attached to PCR primers that amplify the rRNA V6 region. Amplified PCR products are sequenced using an Illumina paired-end protocol to generate millions of overlapping reads. Combinatorial sequence tagging can be used to examine hundreds of samples with far fewer primers than is required when sequence tags are incorporated at only a single end. The number of reads generated permitted saturating or near-saturating analysis of samples of the vaginal microbiome. The large number of reads allowed an in-depth analysis of errors, and we found that PCR-induced errors composed the vast majority of non-organism derived species variants, an observation that has significant implications for sequence clustering of similar high-throughput data. We show that the short reads are sufficient to assign organisms to the genus or species level in most cases. We suggest that this method will be useful for the deep sequencing of any short nucleotide region that is taxonomically informative; these include the V3, V5 regions of the bacterial 16S rRNA genes and the eukaryotic V9 region that is gaining popularity for sampling protist diversity.
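The primer savings from combinatorial tagging can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name `primers_needed` and the choice of 16 forward and 16 reverse tags are assumptions made for the example, not values taken from the paper. It assumes that single-end tagging needs one tagged primer per sample plus one shared untagged primer.

```python
def primers_needed(n_forward_tags: int, n_reverse_tags: int) -> dict:
    """Compare primer counts for combinatorial versus single-end tagging.

    Hypothetical helper for illustration; the tag counts are not from the paper.
    """
    # Every forward tag can be paired with every reverse tag,
    # so the number of distinguishable samples is the product.
    samples = n_forward_tags * n_reverse_tags
    return {
        "samples": samples,
        # Combinatorial tagging: one primer per forward tag plus one per reverse tag.
        "combinatorial_primers": n_forward_tags + n_reverse_tags,
        # Single-end tagging (assumed): one tagged primer per sample
        # plus a single shared untagged primer for the other end.
        "single_end_primers": samples + 1,
    }


if __name__ == "__main__":
    print(primers_needed(16, 16))
```

Under these assumptions, 32 primers are enough to distinguish 256 samples, whereas tagging a single end would require 257 primers.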



Figure 3 (pone-0015406-g003): The proportion of reads in the 25 most abundant OTUs clustered at 92% identity as a function of the number of differences with the seed ISU. The red line shows the plot for the concatenated primer sequences, and the blue line shows the plot for the OTU containing the most abundant ISU.

Mentions: Figure 3 plots the number of reads in each OTU against the number of mismatches with the most frequent read in that OTU (the seed sequence), at a clustering threshold of 92% identity. For an OTU between 72 and 80 bp long, this threshold permits at most 5 or 6 mismatches with the seed sequence (8% of 72–80 bp). The red line in Figure 3 shows the plot for the 37 bp concatenated left and right primer sequences, which are expected to have half the per-nucleotide PCR-dependent error rate of the sequence between the primers, because 50% of the sequence is not derived de novo but is contributed by the primer sequence. Because the concatenated sequence is about one-half the length of the sequence between the primers, the overall slope of the primer line should approximate the slope of a single-species OTU that includes errors arising only from PCR and sequencing. Note that the line for the primer sequence is nearly linear and, in line with our expectations, reads with each additional difference from the seed sequence are far less abundant than reads with one fewer difference.

Also plotted are the results for the 25 most abundant OTUs, with OTU 0, the most abundant OTU comprising 51% of the total reads, shown in blue. The lines for OTU 0 and several other OTUs closely follow the line for the concatenated primers until 4 or 5 differences with the seed sequence are included. The simplest interpretation is that one or more additional rare taxa having 4 or more mismatches with the seed sequence for OTU 0 are being included at this level of clustering. The lines for 11 of the 25 OTUs show a similar pattern, with a sharp increase at 4 or more mismatches. Only 3 of the OTUs show a continuous decline across all numbers of mismatches with the seed member of the OTU, suggesting that, for most OTUs, clustering at 92% identity was including sequences not derived from PCR or sequencing error.
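The kind of tabulation behind Figure 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes all reads in an OTU have equal length and differ only by substitutions (so a simple Hamming distance applies), and the helper names `max_mismatches`, `hamming` and `mismatch_profile` as well as the toy read counts are hypothetical.

```python
from collections import Counter
from typing import Dict


def max_mismatches(length: int, identity_percent: int) -> int:
    """Largest mismatch count that keeps identity at or above the threshold."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return length * (100 - identity_percent) // 100


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (assumption of this sketch)")
    return sum(x != y for x, y in zip(a, b))


def mismatch_profile(read_counts: Dict[str, int]) -> Dict[int, float]:
    """Proportion of reads at each mismatch distance from the seed sequence.

    The seed is taken to be the most frequent sequence in the OTU.
    """
    seed = max(read_counts, key=read_counts.get)
    total = sum(read_counts.values())
    by_distance = Counter()
    for seq, n in read_counts.items():
        by_distance[hamming(seq, seed)] += n
    return {d: by_distance[d] / total for d in sorted(by_distance)}


if __name__ == "__main__":
    # Mismatch limits implied by 92% identity for 72 bp and 80 bp OTUs.
    print(max_mismatches(72, 92), max_mismatches(80, 92))

    # Toy OTU (made-up counts): a seed plus two rarer variants.
    toy_otu = {"ACGTACGTAC": 90, "ACGTACGTAT": 8, "ACGTACGAAT": 2}
    print(mismatch_profile(toy_otu))
```

If an OTU contains only PCR and sequencing errors, the proportions in such a profile should fall steadily as the distance grows; a rise at 4 or more mismatches is the pattern the text attributes to additional rare taxa being merged into the OTU.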

