Average genome size estimation improves comparative metagenomics and sheds light on the functional ecology of the human microbiome.

Nayfach S, Pollard KS - Genome Biol. (2015)

Bottom Line: We developed MicrobeCensus to rapidly and accurately estimate average genome size from shotgun metagenomic data and applied our tool to 1,352 human microbiome samples. We found that average genome size differs significantly within and between body sites and tracks with major functional and taxonomic differences. In the gut, average genome size is positively correlated with the abundance of Bacteroides and genes related to carbohydrate metabolism.

ABSTRACT
Average genome size is an important, yet often overlooked, property of microbial communities. We developed MicrobeCensus to rapidly and accurately estimate average genome size from shotgun metagenomic data and applied our tool to 1,352 human microbiome samples. We found that average genome size differs significantly within and between body sites and tracks with major functional and taxonomic differences. In the gut, average genome size is positively correlated with the abundance of Bacteroides and genes related to carbohydrate metabolism. Importantly, we found that average genome size variation can bias comparative analyses, and that normalization improves detection of differentially abundant genes.

Figure 1: Flowchart for estimating AGS from a shotgun metagenome. 1) MicrobeCensus takes the first n reads of at least i base pairs from the shotgun metagenome and trims these reads down to i base pairs. 2) These reads are aligned against the database of essential genes using RAPsearch2. 3) A read is mapped to an essential gene family, j, if its top scoring alignment satisfies the mapping parameters, which are optimized for gene j and read length i. 4) Based on these mapped reads, the relative abundance of each essential gene family, Rj, is computed. 5) Next, we use Rj to obtain an estimate of AGS for each gene. 6) Outlier predictions are removed and 7) MicrobeCensus takes a weighted average over the remaining estimates to produce a robust estimate of AGS for the shotgun metagenome. QC, quality control.
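Step 1 of the flowchart (read selection and trimming) can be sketched in a few lines of Python. This is only an illustration, not the MicrobeCensus implementation: the function name `select_and_trim` and the representation of reads as plain strings are assumptions made here, and trimming is assumed to keep the first i bases of each read.

```python
from typing import Iterable, Iterator


def select_and_trim(reads: Iterable[str], n: int, i: int) -> Iterator[str]:
    """Yield the first n reads of at least i bp, each trimmed to i bp.

    Hypothetical helper for illustration; reads are plain sequence strings.
    """
    kept = 0
    for seq in reads:
        if kept >= n:
            break  # stop once n qualifying reads have been collected
        if len(seq) >= i:
            # Keep the first i bases, i.e. cut the read back from its 3' end.
            yield seq[:i]
            kept += 1


example_reads = ["ACGTACGT", "ACG", "TTTTGGGGCC", "AACCGGTT", "ACGTAC"]
trimmed = list(select_and_trim(example_reads, n=3, i=6))
print(trimmed)
```

Reads shorter than i are skipped rather than padded, so every read passed on to the alignment step has the same length.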

Mentions: The resulting new method, called MicrobeCensus, rapidly and accurately estimates AGS from metagenomic data (Figure 1). MicrobeCensus first selects the first n reads of at least i base pairs from the metagenome; we found that this downsampling improves computational efficiency without sacrificing accuracy (Additional file 4; Results). Next, these reads are trimmed from their 3′ end down to i bp, principally because our method uses parameters that are read-length specific. The trimmed reads are then translated and aligned against the database of essential genes using RAPsearch2. Reads are classified into a gene family if their top scoring alignment meets or exceeds the optimal mapping parameters for that gene family and the specified read length. We then obtain an estimate of AGS for each gene family based on that family’s relative abundance and proportionality constant. Finally, MicrobeCensus eliminates any outliers and takes a weighted average over the remaining estimates to produce a robust estimate of AGS for the metagenome. We found that the 30 gene families we selected were sufficient to produce accurate estimates of AGS, and additional genes would probably not have significantly improved performance (Additional file 5; Materials and methods).
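The final steps described above (per-family estimates, outlier removal, weighted average) can be illustrated with a minimal Python sketch. Several details are assumptions, because this excerpt does not specify them: the per-family estimate is taken to be the proportionality constant divided by the relative abundance, outliers are flagged with a median-absolute-deviation rule, and weights default to equal. The function name `estimate_ags`, the constants, and the abundances are hypothetical values chosen only for the example.

```python
from statistics import median


def estimate_ags(rel_abundance, constants, weights=None, max_dev=3.0):
    """Combine per-gene-family AGS estimates into one robust value.

    rel_abundance: gene family -> R_j, fraction of reads mapped to family j
    constants:     gene family -> proportionality constant (hypothetical)
    weights:       gene family -> weight; equal weights if omitted (assumption)
    max_dev:       outlier cutoff in median absolute deviations (assumption)
    """
    # Assumed form of the per-family estimate: AGS_j = c_j / R_j.
    estimates = {j: constants[j] / r for j, r in rel_abundance.items() if r > 0}
    if not estimates:
        raise ValueError("no gene family has mapped reads")

    # Assumed outlier rule: drop estimates far from the median, measured in
    # units of the median absolute deviation (MAD). If the MAD is zero, keep all.
    med = median(estimates.values())
    mad = median(abs(e - med) for e in estimates.values())
    kept = {j: e for j, e in estimates.items()
            if mad == 0 or abs(e - med) <= max_dev * mad}

    # Weighted average over the remaining estimates.
    w = {j: (weights[j] if weights else 1.0) for j in kept}
    return sum(w[j] * kept[j] for j in kept) / sum(w.values())


# Hypothetical inputs: four gene families, one with an implausibly low abundance.
R = {"famA": 1.0e-4, "famB": 1.2e-4, "famC": 0.8e-4, "famD": 1.0e-5}
c = {j: 300.0 for j in R}
ags = estimate_ags(R, c)
print(f"estimated AGS: {ags:.0f} bp")
```

In this example the estimate from `famD` lies far from the others and is discarded before averaging, so a single poorly sampled gene family does not dominate the result.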

