Assessing the accuracy of quantitative molecular microbial profiling.

O'Sullivan DM, Laver T, Temisak S, Redshaw N, Harris KA, Foy CA, Studholme DJ, Huggett JF - Int J Mol Sci (2014)

Bottom Line: Amplicon sequencing using four different primer strategies and two 16S rRNA regions was examined (Roche 454 Junior) and compared to WGS (Illumina HiSeq). This work provides a foundation for future work comparing relative differences between samples and the impact of extraction methods. We also highlight the value of control materials when conducting microbial profiling studies to benchmark methods and set appropriate thresholds.


Affiliation: Molecular Biology, LGC Ltd., Queens Road, Teddington TW11 0LY, UK. denise.osullivan@lgcgroup.com.

ABSTRACT
The application of high-throughput sequencing in profiling microbial communities is providing an unprecedented ability to investigate microbiomes. Such studies typically apply one of two methods: amplicon sequencing using PCR to target a conserved orthologous sequence (typically the 16S ribosomal RNA gene) or whole (meta)genome sequencing (WGS). Both methods have been used to catalog the microbial taxa present in a sample and quantify their respective abundances. However, a comparison of the inherent precision or bias of the different sequencing approaches has not been performed. We previously developed a metagenomic control material (MCM) to investigate error when performing different sequencing strategies. Amplicon sequencing using four different primer strategies and two 16S rRNA regions was examined (Roche 454 Junior) and compared to WGS (Illumina HiSeq). All sequencing methods generally performed comparably and in good agreement with organism specific digital PCR (dPCR); WGS notably demonstrated very high precision. Where discrepancies between relative abundances occurred they tended to differ by less than twofold. Our findings suggest that when alternative sequencing approaches are used for microbial molecular profiling they can perform with good reproducibility, but care should be taken when comparing small differences between distinct methods. This work provides a foundation for future work comparing relative differences between samples and the impact of extraction methods. We also highlight the value of control materials when conducting microbial profiling studies to benchmark methods and set appropriate thresholds.

Figure 3: Relative copy number, expressed as a percentage of the metagenomic control material, from amplicon sequencing using different strategies to amplify the 16S rRNA gene. The error bars show the 95% confidence interval. The dashed blue line with a triangle represents the Qubit value and the dark red dashed line with a box represents the dPCR value; the purple line with a cross is replicate 1, the blue line with an asterisk is replicate 2 and the orange line with a triangle is replicate 3 of the sequencing approaches. (a) The α strategy, which targets the Gram-negative members of the MCM; (b) an expanded view of the results for the Gram-negative species; (c) the β strategy, using multiple forward primers with a single reverse primer to target 16S rRNA variable regions 1 and 2; (d) the γ strategy, using a degenerate primer, and (e) the δ strategy, using the ability of T to bind to G and vice versa, targeting 16S rRNA variable regions 4, 5 and 6; (f) relative composition of the MCM as determined by whole genome sequencing.

Mentions: Our standard bioinformatics approach for amplicon sequencing assigns species by aligning reads against a database of the 16S rRNA sequences of the known MCM species. Using this approach, we noted two factors when strategy α was employed. Firstly, there was considerably more inter-run variation (Figure 3a, and as demonstrated by the coefficient of variation in Table 1), although this appeared to be predominantly due to the increased error associated with the Gram-positive bacteria, to which the primers were not specific. Secondly, as the primers for strategy α bound perfectly to the conserved regions of the Gram-negative bacteria, we were able to assess whether differences in the variable regions (Figure S3) would lead to bias in a primer-independent manner. Figure 3b demonstrated good agreement with dPCR, suggesting that the sequence differences present in the respective variable regions (Figure S3) did not lead to considerable bias. When a mixture of specific primers was used (strategy β), the precision was considerably improved (Table 1) and there was better agreement with the dPCR (Figure 3c). When a different variable region was investigated using strategies γ and δ, we noted little difference between the two methods, both of which were more precise than strategy α (Figure 3d,e). An earlier investigation into sources of PCR bias demonstrated that choice of primers had a major influence [24]; we demonstrate here that different variations of the same priming sites can also have the same effect.
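
The comparison described above rests on three quantities: relative abundance as a percentage of the community, the coefficient of variation across replicate runs, and the fold difference from a reference measurement such as dPCR. The sketch below shows one way these could be computed. It is an illustration only, not the authors' pipeline: the function names, species labels, read counts and reference percentages are all hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def relative_abundance(counts):
    """Convert raw per-species counts to percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads assigned to any species")
    return {species: 100.0 * n / total for species, n in counts.items()}


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def fold_difference(observed, reference):
    """Symmetric fold difference (always >= 1) between two abundances."""
    return max(observed, reference) / min(observed, reference)


# Hypothetical read counts from three replicate sequencing runs.
# These numbers are invented for illustration and are NOT from the study.
replicates = [
    {"species_A": 500, "species_B": 300, "species_C": 200},
    {"species_A": 450, "species_B": 350, "species_C": 200},
    {"species_A": 550, "species_B": 250, "species_C": 200},
]

# Hypothetical reference composition (percent), standing in for dPCR values.
dpcr_reference = {"species_A": 40.0, "species_B": 35.0, "species_C": 25.0}

profiles = [relative_abundance(run) for run in replicates]

summary = {}
for species, reference in dpcr_reference.items():
    values = [profile[species] for profile in profiles]
    summary[species] = {
        "mean_percent": mean(values),
        "cv_percent": coefficient_of_variation(values),
        "fold_vs_dpcr": fold_difference(mean(values), reference),
    }

for species, stats in summary.items():
    print(
        f"{species}: mean={stats['mean_percent']:.1f}% "
        f"CV={stats['cv_percent']:.1f}% "
        f"fold vs reference={stats['fold_vs_dpcr']:.2f}"
    )
```

A lower coefficient of variation corresponds to higher precision between runs, and a fold difference close to 1 corresponds to better agreement with the reference. The fold difference is taken symmetrically here so that over- and under-estimation are reported on the same scale, which is a design choice of this sketch rather than something stated in the source.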

