BayesHammer: Bayesian clustering for error correction in single-cell sequencing.

Nikolenko SI, Korobeynikov AI, Alekseyev MA - BMC Genomics (2013)



Affiliation: Algorithmic Biology Laboratory, Academic University, St. Petersburg, Russia. sergey@logic.pdmi.ras.ru

ABSTRACT
Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic. We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BAYESHAMMER. While BAYESHAMMER was designed for single-cell sequencing, we demonstrate that it also improves on existing error correction tools for multi-cell sequencing data while working much faster on real-life datasets. We benchmark BAYESHAMMER on both k-mer counts and actual assembly results with the SPADES genome assembler.
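
The abstract does not define the Hamming graph, so the following is only a minimal sketch under an assumed reading: vertices are k-mers, and two k-mers are joined by an edge when their Hamming distance is at most a threshold. The names `hamming_components` and `tau` are hypothetical, the all-pairs comparison is a naive stand-in rather than the tool's method, and the Bayesian subclustering step is not shown.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_components(kmers, tau):
    """Group k-mers into connected components of the assumed Hamming graph.

    Two k-mers share an edge when their Hamming distance is <= tau
    (tau is a hypothetical parameter name). Components are found with a
    small union-find; the pairwise loop is quadratic and only suits toy input.
    """
    kmers = sorted(set(kmers))
    parent = {kmer: kmer for kmer in kmers}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(kmers, 2):
        if hamming(a, b) <= tau:
            parent[find(a)] = find(b)

    groups = {}
    for kmer in kmers:
        groups.setdefault(find(kmer), []).append(kmer)
    return sorted(groups.values())


# Toy input (made up for illustration): three similar k-mers and one outlier.
components = hamming_components(["ACGTA", "ACGTT", "ACGAT", "GGGCC"], tau=1)
print(components)
```

In this toy input, ACGTA and ACGAT differ in two positions, yet they end up in the same component because each is one substitution away from ACGTT. Components like this are the candidate groups that a later clustering step would split.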


Figure 1: Logarithmic coverage plot for the single-cell E. coli dataset (a similar plot is also given in [2]).

Mentions: Single-cell sequencing datasets have extremely non-uniform coverage that may vary from single digits to thousands along a single genome (Figure 1). For many existing error correction tools, most notably QUAKE [7], uniform coverage is a prerequisite: with non-uniform coverage they either do not work or produce poor results.
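
To illustrate why non-uniform coverage is a problem for correctors that assume uniform coverage, here is a minimal sketch with made-up toy reads. It assumes a deliberately simplified corrector that trusts every k-mer whose count reaches one global cutoff; this is not QUAKE's actual procedure, and all names in it (`trusted_kmers`, `separates`, the sequences) are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Count k-mer occurrences over all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def trusted_kmers(counts, threshold):
    """Simplified rule: a k-mer is trusted if its count reaches the cutoff."""
    return {kmer for kmer, c in counts.items() if c >= threshold}


K = 5
high_region = "ACGTACGGTCA"   # genomic region covered 1000 times (toy data)
low_region = "TTGCAAGCTAG"    # genomic region covered only twice (toy data)
erroneous = "ACGTACGCTCA"     # high_region with one substitution, seen 5 times

reads = [high_region] * 1000 + [erroneous] * 5 + [low_region] * 2
counts = count_kmers(reads, K)

genuine = set(kmers(high_region, K)) | set(kmers(low_region, K))
errors = set(kmers(erroneous, K)) - genuine


def separates(threshold):
    """True if the cutoff keeps every genuine k-mer and no erroneous one."""
    trusted = trusted_kmers(counts, threshold)
    return genuine <= trusted and not (errors & trusted)


working = [t for t in range(1, 1101) if separates(t)]
print(working)  # prints []
```

In this toy setup, erroneous k-mers from the highly covered region occur more often than genuine k-mers from the poorly covered one. Any cutoff low enough to keep the latter also keeps the former, so no single global threshold works.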

