Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.
Bottom Line: In addition to the effect of the taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics.
Affiliation: Department of Life Sciences, Natural History Museum, London, United Kingdom.
Mentions: Here, we assessed what kind of genomic information can be extracted from low-coverage metagenome sequencing of two specimen pools that were originally generated to address questions about taxonomic (Gillett et al. 2014) and ecological diversity (Crampton-Platt et al. 2015). These earlier analyses used only the mtDNA fraction of the sequence data (“mitochondrial metagenomics”; Crampton-Platt et al. 2015) and ignored the much larger nuclear portion, which is interrogated here to obtain insights into the genomic diversity of Coleoptera. High-abundance reads producing the scaffolds in metagenome skimming (MGS) are derived either from orthologous loci conserved among multiple genomes or from paralogous copies, for example, repeat elements present in high-copy numbers (hcn) within a genome, but they may also arise from a combination of orthologous and paralogous sequences (fig. 1). Short shotgun reads therefore produce a mixture of assembled contigs, but their composition may be a largely random outcome of an idiosyncratic assembly process or of the chance composition of the pool of reads. As a first step toward the characterization of the metagenomes, we established whether scaffolds are encountered consistently, and at what sequencing depth, to identify the recognizable high-copy fraction obtained from pools of particular phyletic composition. Next, we attempted to annotate the resulting scaffolds against existing databases, including collections of known repeats, and to identify potential conserved coding regions, such as gene families and tandemly repeated genes. Mapping of scaffolds against the two available reference genomes can further provide information on their intragenomic organization and their intergenomic distribution across evolutionary lineages.
Conversely, the number and distribution of scaffolds mapped against full genome sequences can contribute a new approach to comparative genomics, specifically to the analysis of the repetitive fraction, which is notoriously difficult to characterize with standard genome sequencing methods. Finally, the scaffolds may represent the associated fauna and flora, including the microbiome and potential food sources, which provide information on the wider ecosystem in which the specimens participate.