Environmental monitoring using next generation sequencing: rapid identification of macroinvertebrate bioindicator species.

Carew ME, Pettigrove VJ, Metzeling L, Hoffmann AA - Front. Zool. (2013)

Bottom Line: We find that 454-generated COI sequences successfully identified up to 96% of species in samples, but this increased up to 99% when combined with CytB sequences. We also found a strong quantitative relationship between the number of 454 sequences and individuals, showing that it may be possible to estimate the abundance of species from 454 pyrosequencing data. Next generation sequencing using two genes was successful for identifying chironomid species.


Affiliation: Department of Zoology, Victorian Centre for Aquatic Pollution Identification and Management (CAPIM), The University of Melbourne, Victoria 3010, Australia. mecarew@unimelb.edu.au.

ABSTRACT

Introduction: Invertebrate communities are central to many environmental monitoring programs. In freshwater ecosystems, aquatic macroinvertebrates are collected, identified and then used to infer ecosystem condition. Yet the key step of species identification is often not taken, as it requires a high level of taxonomic expertise, which is lacking in most organizations, or species cannot be identified as they are morphologically cryptic or represent little known groups. Identifying species using DNA sequences can overcome many of these issues; with the power of next generation sequencing (NGS), using DNA sequences for routine monitoring becomes feasible.

Results: In this study, we test whether NGS can be used to identify species from field-collected samples in an important bioindicator group, the Chironomidae. We show that Cytochrome oxidase I (COI) and Cytochrome B (CytB) sequences provide accurate DNA barcodes for chironomid species. We then develop an NGS analysis pipeline to identify species using megablast searches of high-quality sequences generated using 454 pyrosequencing against comprehensive reference libraries of Sanger-sequenced voucher specimens. We find that 454-generated COI sequences successfully identified up to 96% of species in samples, but this increased up to 99% when combined with CytB sequences. Accurate identification depends on having at least five sequences for a species; below this level, species not expected in samples were detected. Incorrect incorporation of some multiplex identifiers (MIDs) used to tag samples was a likely cause, and most errors could be detected when using MID tags on both forward and reverse primers. We also found a strong quantitative relationship between the number of 454 sequences and individuals, showing that it may be possible to estimate the abundance of species from 454 pyrosequencing data.

Conclusions: Next generation sequencing using two genes was successful for identifying chironomid species. However, when detecting species from 454 pyrosequencing data sets it was critical to include known individuals for quality control and to establish thresholds for detecting species. The NGS approach developed here can lead to routine species-level diagnostic monitoring of aquatic ecosystems.
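
The detection threshold described in the Results (at least five sequences per species) can be illustrated with a minimal Python sketch. It assumes each read has already been assigned to a species from its best megablast hit; the function name `call_species`, the input layout and the example species labels are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

# Threshold stated in the abstract: at least five sequences per species.
MIN_READS = 5


def call_species(assignments, min_reads=MIN_READS):
    """Split species into detected and rejected by read count.

    assignments: iterable of (read_id, species) pairs, one pair per read,
    where species is the best-hit label (an assumed input format).
    """
    counts = Counter(species for _, species in assignments)
    detected = {sp: n for sp, n in counts.items() if n >= min_reads}
    rejected = {sp: n for sp, n in counts.items() if n < min_reads}
    return detected, rejected


# Hypothetical sample: 12 reads for one species, 3 stray reads for another.
hits = [(f"read{i}", "species_A") for i in range(12)]
hits += [(f"read{i}", "species_B") for i in range(12, 15)]

detected, rejected = call_species(hits)
print("detected:", detected)
print("below threshold:", rejected)
```

Keeping the rejected species together with their counts, rather than discarding them, makes it easier to check whether low-count detections coincide with the MID tagging errors mentioned in the Results.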


Figure 5: Experimental design and data analysis pipeline. The first half of the pipeline (in blue) shows the experimental set up, where species in samples were amplified individually (using morphology, PCR-RFLP and Sanger sequencing) and in bulk using 454 pyrosequencing. The second half of the pipeline (in orange) deals with the analysis of the sequences generated with 454 pyrosequencing.

Mentions: Reference DNA sequence databases, based on partial sequences from mitochondrial COI and CytB genes, were constructed with multiple individuals from over 120 Chironomidae species collected largely from south-eastern Australia. During construction of the databases, two COI primer combinations were used to amplify 658 bp of the COI gene. For recently processed samples we used the ‘COI barcoding primers’ HCO2189 and LCO1490 (Table 3, Additional file 3: Figure S1a) according to the PCR conditions in Krosch et al. [74], while for older samples the primers 911 and 912 were used according to the PCR conditions in Carew et al. [75] (Table 3, Figure 5). Two primer combinations were also used to amplify the CytB gene. The primers CB1 and T-N-S1 amplified between 742 bp and 837 bp of the 5’ end of the CytB gene and part of the length-variable tRNA serine (Table 3, Additional file 3: Figure S1b). As these primers do not universally amplify CytB in the Chironomidae, we designed a second degenerate primer, CB 549 R (Table 3, Additional file 3: Figure S1b), which when used with CB1 amplified 592 bp of the CytB gene. Both CytB fragments were amplified according to the PCR conditions in Carew et al. [73]. All PCR products were sequenced in both directions, with sequencing reactions and runs performed by Macrogen (Seoul, Korea). Forward and reverse sequences were aligned and manually edited in Sequencher (version 4.7, Genecodes, Ann Arbor, MI, USA). Consensus sequences for each individual were then exported as concatenated FASTA files from Sequencher and imported into Geneious version 5.6.6 [76], where they were used as reference DNA databases to identify species from the 454 pyrosequencing experiment.
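
As a rough illustration of how a concatenated FASTA export could be organised into a per-species reference library, the following Python sketch parses FASTA text and groups consensus sequences by species. The authors performed this step in Geneious; the header layout `species|voucher`, the function names and the example records are assumptions made only for this sketch.

```python
from collections import defaultdict


def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines.

    Records with identical headers would overwrite each other, so headers
    are assumed to be unique.
    """
    records = {}
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def group_by_species(records):
    """Group sequences by species, assuming 'species|voucher' headers."""
    library = defaultdict(list)
    for header, seq in records.items():
        species, _, voucher = header.partition("|")
        library[species].append((voucher, seq))
    return dict(library)


# Hypothetical export: two individuals of one species, one of another.
example = """>species_A|V001
ACGTACGT
ACGT
>species_A|V002
ACGTACGA
>species_B|V010
TTGACCGT
"""

library = group_by_species(read_fasta(example))
for species, entries in library.items():
    print(species, len(entries), "reference sequence(s)")
```

Holding several vouchers per species, as the reference databases here do, lets a query read match any known haplotype of that species rather than a single representative sequence.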

