Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

Linard B, Crampton-Platt A, Gillett CP, Timmermans MJ, Vogler AP - Genome Biol Evol (2015)

Bottom Line: In addition to the effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics.


Affiliation: Department of Life Sciences, Natural History Museum, London, United Kingdom.


Phylogenetic content of the metagenomes. (A) Circle intersections of scaffold overlap among the different libraries. (B) Classification of scaffolds based on best hits in genome databases. See text for details of the five categories. The y axis represents the absolute number of scaffolds. Circle size represents category proportion relative to the total number of scaffolds in a library. (C) Circle intersections of scaffold overlap among the different libraries for scaffolds assigned to Hexapoda.
© Copyright Policy - creative-commons


Mentions: The TruSeq libraries (Weevil, Canopy_Long, Canopy_Short) produced 17.3–23.9 M read pairs and the Nextera library (Canopy_Next) produced 7.3 M reads. Following trimming, 30% of reads were discarded in the three Canopy libraries and 5% in the Weevil library (table 1). Assembly of each of the four Illumina libraries produced between 20,000 and nearly 100,000 contigs, and numbers were only slightly lower for (noncontiguous) scaffolds (table 1). Using the same DNA pool, both TruSeq libraries resulted in more than twice the number of reads as the Nextera library, and Canopy_Long assembled almost twice as many contigs and scaffolds as Canopy_Short and over three times as many as Canopy_Next. The Weevil pool produced the largest number of scaffolds despite containing the second lowest number of reads; its long insert size and the greater homogeneity of read numbers from equimolar DNA samples may have aided the assembly.

We determined intersections of library contents with pairwise alignments of the scaffolds (fig. 2A). The scaffolds of the three Canopy libraries were aligned with a stringent threshold of sequence identity >90%, E < 1e-18, alignment length >250 bp. In total, 19,297 scaffolds were shared by at least two Canopy libraries, and the tripartite intersection showed a core of 6,940 scaffolds (11–35% of the libraries) that was consistently recovered despite the low-coverage sequencing (fig. 2A, left). We performed a similar pairwise alignment between the Weevil library and the scaffold collection of all Canopy libraries (Canopy_merged), with a slightly lower threshold (sequence identity >80%, E < 1e-18, alignment length >250 bp) to recover potential homologs among different species (fig. 2A, right). A total of 5,174 scaffolds were shared by both samples (5.8% of Weevil scaffolds; 4.7% of Canopy scaffolds), showing that thousands of similar scaffolds can also be recovered between pools of different species composition.
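The threshold filtering described above can be illustrated with a minimal Python sketch. It assumes the pairwise alignments are available as 12-column tab-separated hits in the BLAST tabular layout (query, subject, percent identity, alignment length, ..., E-value, bit score); the excerpt does not name the alignment tool or output format, so this layout, the function names, and the scaffold IDs in the toy data are hypothetical and chosen only for illustration. Only the three cutoffs (identity, E-value, alignment length) come from the text.

```python
from typing import Iterable, List, Set, Tuple

# Column positions in the assumed 12-column tabular hit format
# (qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore).
QUERY, SUBJECT, PIDENT, LENGTH, EVALUE = 0, 1, 2, 3, 10


def passes_threshold(fields: List[str], min_identity: float,
                     max_evalue: float = 1e-18, min_length: int = 250) -> bool:
    """Apply the strict inequalities quoted in the text to one hit."""
    return (float(fields[PIDENT]) > min_identity
            and float(fields[EVALUE]) < max_evalue
            and int(fields[LENGTH]) > min_length)


def shared_scaffolds(lines: Iterable[str],
                     min_identity: float) -> Tuple[Set[str], Set[str]]:
    """Return the query and subject scaffold IDs with at least one passing hit."""
    queries: Set[str] = set()
    subjects: Set[str] = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        if passes_threshold(fields, min_identity):
            queries.add(fields[QUERY])
            subjects.add(fields[SUBJECT])
    return queries, subjects


# Toy hits (invented IDs and values, not data from the paper).
example_hits = [
    "weevil_s1\tcanopy_s7\t85.2\t400\t59\t0\t1\t400\t1\t400\t1e-50\t300",
    "weevil_s2\tcanopy_s9\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-40\t250",  # too short
    "weevil_s3\tcanopy_s4\t92.0\t300\t24\t0\t1\t300\t1\t300\t1e-10\t120",  # E-value too high
    "weevil_s4\tcanopy_s4\t91.5\t260\t22\t0\t1\t260\t1\t260\t1e-30\t200",
]

# Relaxed cutoff (>80% identity), as used for Weevil vs. Canopy_merged.
relaxed_q, relaxed_s = shared_scaffolds(example_hits, min_identity=80.0)
# Stringent cutoff (>90% identity), as used among the Canopy libraries.
strict_q, strict_s = shared_scaffolds(example_hits, min_identity=90.0)

print(sorted(relaxed_q), sorted(relaxed_s))
print(sorted(strict_q), sorted(strict_s))
```

Dividing the size of each returned set by the total scaffold count of the corresponding library would give percentages of the kind reported above (for example, 5.8% of Weevil scaffolds), although how the authors counted shared scaffolds in detail is not stated in this excerpt.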

