MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities.

Kang DD, Froula J, Egan R, Wang Z - PeerJ (2015)

Bottom Line: Most existing metagenome binning tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. It automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node.


Affiliation: Department of Energy Joint Genome Institute, Walnut Creek, CA, USA; Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.

ABSTRACT
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. It automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open-source software and is available at https://bitbucket.org/berkeleylab/metabat.
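To make the tetranucleotide frequency signal mentioned in the abstract concrete, the following is a minimal sketch of how a per-contig tetranucleotide frequency vector could be computed. It is not MetaBAT's implementation: the function names, the choice to merge each 4-mer with its reverse complement (which yields 136 canonical tetramers), and the skipping of windows containing ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# Translation table used to build reverse complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


# All distinct canonical 4-mers, in a fixed order (assumed feature layout).
CANONICAL_TETRAMERS = sorted(
    {canonical("".join(p)) for p in product("ACGT", repeat=4)}
)


def tetranucleotide_frequency(sequence):
    """Hypothetical helper: normalized canonical 4-mer frequencies of one contig."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - 3):
        kmer = sequence[i:i + 4]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CANONICAL_TETRAMERS)
    return [counts[k] / total for k in CANONICAL_TETRAMERS]
```

Because each 4-mer is merged with its reverse complement, a contig and its reverse complement map to the same vector, so strand orientation in the assembly does not affect the composition signal.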



Figure 4: Binning performance on real metagenomic assemblies. (A) The number of genomes (X-axis) identified by each binning method (Y-axis) at different recall (completeness) thresholds and >90% precision, where precision measures the lack of contamination. (B) Venn diagram of the genomes identified by the top 4 binning methods.

Mentions: As in the previous “error-free” experiment, MetaBAT again identified the greatest number of unique genome bins with >90% precision (Fig. 4A). In this experiment with real metagenomic contigs, the superior completeness we saw in CONCOCT during the “error-free” experiment was lost. Moreover, the number of genome bins formed by MetaBAT was consistently greater than that of the other tools at every completeness threshold. As before, different tools produced complementary binning results (Fig. 4B). MetaBAT’s contribution appears to be more pronounced this time. It missed 17 bins formed by all other tools combined, but recovered 31 bins that no other tool produced. MetaBAT alone recovered 90.2% (133/144) of genome bins from all tools. These results suggest MetaBAT is very robust when run against a real metagenome assembly. Consistent with the simulation experiment, MetaBAT is computationally very efficient and required only 4 min to complete this experiment (Table 2).
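The counts in Fig. 4A depend on a precision and a recall (completeness) value for each bin. The sketch below shows one common base-pair-weighted way to compute them and to tally recovered genomes per recall threshold. It assumes each contig has already been assigned to a reference genome; the function names, input dictionaries, and default thresholds are hypothetical, and the exact definitions used in the paper may differ.

```python
from collections import defaultdict


def bin_precision_recall(bin_contigs, contig_genome, contig_length, genome_size):
    """Return (dominant genome, precision, recall) for one bin.

    Assumed definitions: precision is the fraction of the bin's base pairs
    that come from its dominant genome; recall is the fraction of that
    genome's base pairs captured by the bin.
    """
    bp_by_genome = defaultdict(int)
    for contig in bin_contigs:
        bp_by_genome[contig_genome[contig]] += contig_length[contig]
    dominant = max(bp_by_genome, key=bp_by_genome.get)
    bin_bp = sum(bp_by_genome.values())
    precision = bp_by_genome[dominant] / bin_bp
    recall = bp_by_genome[dominant] / genome_size[dominant]
    return dominant, precision, recall


def genomes_per_recall_threshold(bins, contig_genome, contig_length, genome_size,
                                 recall_thresholds=(0.3, 0.5, 0.7, 0.9),
                                 min_precision=0.9):
    """Count unique genomes recovered at each recall threshold, keeping only
    bins whose precision exceeds min_precision."""
    recovered = {t: set() for t in recall_thresholds}
    for bin_contigs in bins.values():
        if not bin_contigs:
            continue
        genome, precision, recall = bin_precision_recall(
            bin_contigs, contig_genome, contig_length, genome_size
        )
        if precision <= min_precision:
            continue  # too contaminated to count
        for t in recall_thresholds:
            if recall >= t:
                recovered[t].add(genome)
    return {t: len(genomes) for t, genomes in recovered.items()}
```

Counting unique genomes rather than bins avoids crediting a tool twice when it splits one genome across several high-precision bins.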

