Targeted recovery of novel phylogenetic diversity from next-generation sequence data.

Lynch MD, Bartram AK, Neufeld JD - ISME J (2012)

Bottom Line: We combined BLASTN network analysis, phylogenetics and targeted primer design to amplify 16S rRNA gene sequences from unique potential bacterial lineages, comprising part of the rare biosphere from a multi-million sequence data set from an Arctic tundra soil sample. Demonstrating the feasibility of the protocol developed here, three of seven recovered phylogenetic lineages represented extremely divergent taxonomic entities. A comparison to twelve next-generation data sets from additional soils suggested persistent low-abundance distributions of these novel 16S rRNA genes.


Affiliation: Department of Biology, University of Waterloo, Waterloo, ON, Canada.

ABSTRACT
Next-generation sequencing technologies have led to recognition of a so-called 'rare biosphere'. These microbial operational taxonomic units (OTUs) are defined by low relative abundance and may be specifically adapted to maintaining low population sizes. We hypothesized that mining of low-abundance next-generation 16S ribosomal RNA (rRNA) gene data would lead to the discovery of novel phylogenetic diversity, reflecting microorganisms not yet discovered by previous sampling efforts. Here, we test this hypothesis by combining molecular and bioinformatic approaches for targeted retrieval of phylogenetic novelty within rare biosphere OTUs. We combined BLASTN network analysis, phylogenetics and targeted primer design to amplify 16S rRNA gene sequences from unique potential bacterial lineages, comprising part of the rare biosphere from a multi-million sequence data set from an Arctic tundra soil sample. Demonstrating the feasibility of the protocol developed here, three of seven recovered phylogenetic lineages represented extremely divergent taxonomic entities. These divergent target sequences correspond to (a) a previously unknown lineage within the BRC1 candidate phylum, (b) a sister group to the early diverging and currently recognized monospecific Cyanobacteria Gloeobacter, a genus containing multiple plesiomorphic traits and (c) a highly divergent lineage phylogenetically resolved within mitochondria. A comparison to twelve next-generation data sets from additional soils suggested persistent low-abundance distributions of these novel 16S rRNA genes. The results demonstrate this sequence analysis and retrieval pipeline as applicable for exploring underrepresented phylogenetic novelty and recovering taxa that may represent significant steps in bacterial evolution.
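The selection idea in the abstract (look among low-abundance OTUs for sequences that are poorly represented in reference databases) can be illustrated with a minimal sketch. It is not the authors' pipeline: the BLASTN network analysis and phylogenetic steps are not reproduced, and the `Otu` record, the `select_candidates` function and both thresholds are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Otu:
    """Hypothetical record for one OTU from a 16S rRNA gene library."""
    name: str
    reads: int                # number of reads assigned to the OTU
    best_hit_identity: float  # percent identity of the best BLASTN hit


def select_candidates(otus, max_rel_abundance=0.001, max_identity=90.0):
    """Return OTUs that are both rare and divergent, most divergent first.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    total = sum(o.reads for o in otus)
    if total == 0:
        return []
    picked = [
        o for o in otus
        if o.reads / total <= max_rel_abundance
        and o.best_hit_identity <= max_identity
    ]
    # Lowest identity to any reference sequence comes first.
    return sorted(picked, key=lambda o: o.best_hit_identity)


# Toy library with made-up counts and identities.
library = [
    Otu("otu_a", 9900, 99.0),
    Otu("otu_b", 60, 85.0),
    Otu("otu_c", 5, 80.0),
    Otu("otu_d", 5, 98.0),
    Otu("otu_e", 30, 88.0),
]
candidates = select_candidates(library)
```

In practice, candidates chosen this way would still need the phylogenetic placement and targeted amplification described in the abstract before any claim of novelty could be made.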


Figure 2: Heatmap representing identities of highest BLASTN hits for each amplified clone from putative novel lineages. Sequences amplified from Alert, NU; (a) excluding uncultured or environmental sequences and (b) unfiltered non-redundant NCBI (National Center for Biotechnology Information) database. Euk. = Eukaryota, Chim. = chimeric sequence, NS = no sequences successfully obtained. Only six sequences were attempted for positive control (+) groups.
© Copyright Policy - open-access

Mentions: Using a gradient of annealing temperatures to identify stringent PCR conditions, near full-length SSU sequences were amplified and sequenced from seven of the eight experimental clades using custom UL primers (Supplementary Table S1). Only the UL14 primers, designed against a sister group to the Clostridium genus (Supplementary Figure S2), did not successfully amplify full-length 16S rRNA genes. UL primer design and PCR amplification were also conducted on two relatively abundant Alert library sequences, representing Acidobacteria and Alphaproteobacteria, to serve as positive controls (Supplementary Table S2). Demonstrating the specificity of the targeted PCR, nearly all retrieved 16S rRNA gene sequences were associated with the anticipated V3 region, because the Sanger sequence data directly adjacent to the PCR primers were consistent with the original V3-region sequence. However, five sequences from UL13 were associated with the Eukaryota (Figure 2) despite amplification with the prokaryote-specific 1512uR (Weisburg et al., 1991) reverse primer (Bartram et al., 2011). Subsequent investigations with RDP Probematch (Cole et al., 2007; Cole et al., 2009) revealed a surprisingly high identity for this primer against archaeal sequences, fully matching 75% of Archaea. The alternative reverse primer we used, 907R, matched only 0.39% of Archaea. RDP Probematch does not compare against Eukaryota sequences, but because Eukaryota 18S rRNA gene sequences are closer to those of Archaea than to those of Bacteria, this result implies some identity of the primer with Eukaryota sequences. Two near full-length sequences were putative chimeras as determined by UCHIME and were excluded from analyses. In total, 84 clones were successfully sequenced for near full-length bacterial 16S rRNA genes.
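The primer coverage figures above (the share of sequences in a group that a primer fully matches) can be illustrated with a small sketch. This is not RDP Probematch and does not reflect how that tool works internally; it is a naive scan, assuming IUPAC degenerate primer codes, and every function name and sequence in it is hypothetical toy data rather than a real primer or rRNA gene.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table, including degenerate codes (for example R <-> Y).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA or RNA sequence, returned as DNA."""
    return seq.upper().replace("U", "T").translate(COMPLEMENT)[::-1]


def site_mismatches(primer, window):
    """Count positions where the window base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, window))


def has_match(primer, target, max_mismatches=0):
    """True if any window of the target matches the primer within the limit."""
    primer = primer.upper()
    target = target.upper().replace("U", "T")
    n = len(primer)
    return any(
        site_mismatches(primer, target[i:i + n]) <= max_mismatches
        for i in range(len(target) - n + 1)
    )


def fraction_matching(primer, targets, reverse=False, max_mismatches=0):
    """Fraction of target sequences containing a site for the primer.

    For a reverse primer, the site on the sense strand is the primer's
    reverse complement, so that is what gets searched for.
    """
    probe = reverse_complement(primer) if reverse else primer
    hits = sum(has_match(probe, t, max_mismatches) for t in targets)
    return hits / len(targets) if targets else 0.0


# Toy sequences only; Y in the primer allows C or T at that position.
toy_targets = ["TTACGCTAA", "TTACGTTAA", "TTACGATAA"]
coverage = fraction_matching("ACGYT", toy_targets)
```

Here two of the three toy targets contain a perfect site, so `coverage` is two thirds; allowing one mismatch (`max_mismatches=1`) would count the third target as well. A comparison like the one between 1512uR and 907R would amount to running such a check per primer against a curated set of sequences for each domain.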

