Solution hybrid selection capture for the recovery of functional full-length eukaryotic cDNAs from complex environmental samples.
Bottom Line: After two successive rounds of capture, >90% of the resulting cDNAs were GH11 sequences, of which 70% (38 of 53 sequenced genes) were full length. Sequencing of polymerase chain reaction-amplified GH11 gene fragments from the captured sequences highlighted hundreds of phylogenetically diverse sequences that were not yet described in public databases. This protocol offers the possibility of performing exhaustive exploration of eukaryotic gene families within microbial communities thriving in any type of environment.
Affiliation: Department of Life Sciences and Systems Biology, University of Turin, viale Mattioli 25, Turin 10125, Italy; Ecologie Microbienne, UMR CNRS 5557, USC INRA 1364, Université de Lyon, Université Lyon 1, Villeurbanne 69622, France.
Mentions: To evaluate the diversity of GH11 sequences at each step of the capture protocol, we performed high-throughput Illumina MiSeq sequencing of GH11 amplicons obtained from all four cDNA samples before (H0) and after one (H1) or two (H2) cycles of SHS capture. Paired-end sequence reads were assembled to reconstitute the ca. 281-bp-long amplicons. Altogether, the data set contained 334,161 full-length amplicon sequences, which were clustered at a 95% nucleotide sequence identity threshold to yield 1,458 clusters, of which 1,001 (69%) were singletons (data summarized in Table 2 for each sample). Each of the 12 sequence data sets (4 cDNA samples × the 3 steps of the SHS) was characterized by a few dominant clusters encompassing most of the sequences and a large number of clusters each containing only a few sequences, or even a single one (illustrated in Fig. 3A for the PUE sample). None of the sequences obtained were identical to sequences deposited in databases. Only 17 of the sequence clusters, 14 of them exclusively from the BEW site, were >90% identical (maximum value of 97.5%) at the nucleotide level over their entire length to GH11 genes from either the Basidiomycota Tulasnella calospora or the Ascomycota Nectria haematococca and Pyrenophora teres.
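The clustering step described above can be illustrated with a minimal Python sketch. The excerpt does not name the clustering tool or algorithm that was used, so this is only an assumed stand-in: a greedy, abundance-sorted centroid clustering in which identity is computed position by position, which presumes equal-length amplicons without indels. The function names (`identity`, `greedy_cluster`, `singleton_fraction`) are hypothetical and introduced here for illustration only.

```python
from collections import Counter


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Assumption: no indels. Sequences of different length score 0.0.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold=0.95):
    """Greedy centroid clustering (illustrative, not the authors' tool).

    Unique sequences are visited from most to least abundant. Each one
    joins the first centroid it matches at >= threshold identity,
    otherwise it becomes a new centroid.
    Returns (centroids, sizes) with sizes counted in reads.
    """
    centroids, sizes = [], []
    for seq, n in Counter(seqs).most_common():
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                sizes[i] += n
                break
        else:
            centroids.append(seq)
            sizes.append(n)
    return centroids, sizes


def singleton_fraction(sizes):
    """Share of clusters that contain exactly one read."""
    return sum(1 for s in sizes if s == 1) / len(sizes)
```

Applied to the figures reported above, the singleton share is 1,001 / 1,458, which rounds to the stated 69%. A production analysis would normally rely on an alignment-based clustering tool rather than this position-wise comparison.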