MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities.

Kang DD, Froula J, Egan R, Wang Z - PeerJ (2015)

Bottom Line: Most existing metagenome binning tools do not scale to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. It automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node.


Affiliation: Department of Energy Joint Genome Institute, Walnut Creek, CA, USA; Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.

ABSTRACT
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. It automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and is available at https://bitbucket.org/berkeleylab/metabat.
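To make the tetranucleotide frequency signal mentioned in the abstract concrete, here is a minimal Python sketch that counts overlapping 4-mers in a contig and normalizes them to frequencies. It is an illustration only, not MetaBAT's code: the function name `tetranucleotide_frequency` is hypothetical, reverse-complement 4-mers are not merged, and the empirical probabilistic distance that MetaBAT derives from these profiles is not reproduced.

```python
from collections import Counter
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
ALL_TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequency(sequence: str) -> dict:
    """Return the relative frequency of every 4-mer in `sequence`.

    Hypothetical helper for illustration. Windows containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - 3):
        kmer = sequence[i:i + 4]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    total = sum(counts.values())
    # Report all 256 tetramers so profiles from different contigs align.
    return {k: (counts[k] / total if total else 0.0) for k in ALL_TETRAMERS}


profile = tetranucleotide_frequency("ACGTACGT")
```

Contigs from the same genome tend to have similar profiles of this kind, which is why composition is commonly paired with abundance (coverage) information when binning.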



Fig. 5: Comparison between MetaBAT bins after post-processing and MGS draft genomes from Nielsen et al. (A) Venn diagram of genome bins identified by MetaBAT with >90% precision and >30% completeness, as calculated by CheckM, and their one-to-one corresponding MGS draft genomes. (B) Scatterplot of completeness and precision for MetaBAT genome bins, taking the MGS draft genomes as the gold standard. The x-axis shows the proportion of shared bases relative to the MetaBAT bin (i.e., precision), and the y-axis shows the proportion of shared bases relative to the MGS genome (i.e., completeness). Each circle represents a unique MetaBAT bin with a uniquely corresponding MGS genome (342 bins in total), and circle size corresponds to bin size.

Mentions: By incorporating additional sequencing data and other post-binning optimizations, Nielsen et al. (2014) generated 373 high-quality draft genomes (“MGS genomes”). We therefore used these MGS draft genomes as a reference for additional quality assessment of the MetaBAT genome bins after post-processing. As shown in Fig. 5A, 31 MGS draft genomes were not well represented by MetaBAT bins (the remaining 342 have a one-to-one MetaBAT counterpart), but MetaBAT recovered 55 additional genome bins that are not among the MGS draft genomes. For the overlapping bins, most MetaBAT bins closely approximate the MGS draft genomes in accuracy, with 94% precision and 82% completeness (Fig. 5B).
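The precision and completeness used in Fig. 5B are both ratios of shared bases, differing only in the denominator. The following Python sketch shows that arithmetic under simplifying assumptions: each bin and reference genome is represented as a mapping from contig name to contig length, a contig counts as shared only if it appears whole in both, and the function name and example values are hypothetical rather than taken from the paper.

```python
from typing import Dict, Tuple


def precision_completeness(
    bin_contigs: Dict[str, int], ref_contigs: Dict[str, int]
) -> Tuple[float, float]:
    """Compare one genome bin against one reference genome.

    precision    = shared bases / total bases in the bin
    completeness = shared bases / total bases in the reference
    Both arguments map contig name -> contig length in bases
    (an assumed representation for this sketch).
    """
    shared_bases = sum(
        length for name, length in bin_contigs.items() if name in ref_contigs
    )
    bin_bases = sum(bin_contigs.values())
    ref_bases = sum(ref_contigs.values())
    precision = shared_bases / bin_bases if bin_bases else 0.0
    completeness = shared_bases / ref_bases if ref_bases else 0.0
    return precision, completeness


# Made-up example: the bin and the reference share 9,000 bases.
example_bin = {"contig_1": 6000, "contig_2": 3000, "contig_3": 1000}
example_ref = {"contig_1": 6000, "contig_2": 3000, "contig_4": 3000}
precision, completeness = precision_completeness(example_bin, example_ref)
```

In this made-up case the bin holds 10,000 bases and the reference 12,000, so precision is 0.9 and completeness is 0.75. A bin can therefore be highly precise while still missing a sizeable part of the reference genome.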

