Fast and sensitive mapping of nanopore sequencing reads with GraphMap.

Sović I, Šikić M, Wilm A, Fenlon SN, Chen S, Nagarajan N - Nat Commun (2016)



Affiliation: Computational & Systems Biology, Genome Institute of Singapore, 60 Biopolis Street, #02-01 Genome, Singapore 138672, Singapore.

ABSTRACT
Realizing the democratic promise of nanopore sequencing requires the development of new bioinformatics approaches to deal with its specific error characteristics. Here we present GraphMap, a mapping algorithm designed to analyse nanopore sequencing reads, which progressively refines candidate alignments to robustly handle potentially high error rates and uses a fast graph traversal to align long reads with speed and high precision (>95%). Evaluation on MinION sequencing data sets against short- and long-read mappers indicates that GraphMap increases mapping sensitivity by 10–80% and maps >95% of bases. GraphMap alignments enabled single-nucleotide variant calling on the human genome with increased sensitivity (15%) over the next best mapper, precise detection of structural variants from 100 bp to 4 kbp in length, and species- and strain-specific identification of pathogens using MinION reads. GraphMap is available open source under the MIT license at https://github.com/isovic/graphmap.


Figure 2: Evaluating GraphMap's precision and recall on synthetic ONT data. (a) GraphMap (shaded bars) performance in comparison to BLAST (solid bars) on ONT 2D and 1D reads. Genomes are ordered horizontally by genome size from smallest to largest. For each data set, the graph on the left shows performance for determining the correct mapping location (within 50 bp; y axis on the left) and the one on the right shows performance for the correct alignment of bases (y axis on the right; Methods section). (b) Precision and recall for determining the correct mapping location (within 50 bp) for various mappers on synthetic ONT 1D reads.

GraphMap was designed to be efficient while being largely agnostic of error profiles and rates. To evaluate this feature, a wide range of synthetic data sets were generated that capture the diversity of sequencing technologies (Illumina, PacBio, ONT 2D, ONT 1D) and the complexity of different genomes (Fig. 2, Supplementary Fig. 1a). GraphMap's precision and recall were then measured in terms of identifying the correct read location and in reconstructing the correct alignment to the reference (Methods section). These were evaluated separately because, in principle, a mapper can identify the correct location but compute an incorrect alignment of the read to the reference. To provide a gold standard to compare against, BLAST (ref. 16) was used as a representative of a highly sensitive but slow aligner that is agnostic to sequencing technology. On synthetic Illumina and PacBio data, GraphMap's results were found to be comparable to BLAST (Supplementary Note 1) as well as other mappers (Supplementary Data 1). On synthetic ONT data, we noted slight differences (<3%) between BLAST and GraphMap, but notably, GraphMap improved over BLAST in finding the right mapping location in some cases (for example, for N. meningitidis ONT 1D data; Fig. 2a). GraphMap's precision and recall in selecting the correct mapping location were consistently >94%, even with high error rates in the simulated data. Unlike other mappers, GraphMap's results were obtained without tuning parameters to the specifics of the sequencing technology.
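
To make the location-based metric concrete, the following is a minimal Python sketch of how precision and recall for mapping location could be scored on simulated reads. It is not the authors' evaluation code. The function name `location_precision_recall`, the data layout, and the exact definitions are assumptions for illustration: a mapped read counts as correct if it lands on the true reference sequence within 50 bp of the simulated start position (treated here as inclusive), precision is taken as correct reads over mapped reads, and recall as correct reads over all simulated reads. The definitions in the paper's Methods section take precedence where they differ.

```python
from typing import Dict, Optional, Tuple

# (reference name, 0-based start position); None means the read was left unmapped.
Location = Tuple[str, int]


def location_precision_recall(
    truth: Dict[str, Location],
    predicted: Dict[str, Optional[Location]],
    tolerance: int = 50,
) -> Tuple[float, float]:
    """Score mapping locations against simulated ground truth.

    Assumed definitions (illustrative, not taken from the paper):
      precision = correctly placed reads / reads the mapper reported
      recall    = correctly placed reads / all simulated reads
    """
    mapped = 0
    correct = 0
    for read_id, (true_ref, true_pos) in truth.items():
        hit = predicted.get(read_id)
        if hit is None:
            continue  # unmapped reads lower recall but not precision
        mapped += 1
        ref, pos = hit
        # Correct only if on the same reference and within the tolerance window.
        if ref == true_ref and abs(pos - true_pos) <= tolerance:
            correct += 1
    precision = correct / mapped if mapped else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy data: four simulated reads, three of which were mapped.
truth = {
    "read1": ("chr1", 1000),
    "read2": ("chr1", 5000),
    "read3": ("chr1", 9000),
    "read4": ("chr2", 200),
}
predicted = {
    "read1": ("chr1", 1010),  # 10 bp off: correct
    "read2": ("chr1", 5051),  # 51 bp off: incorrect
    "read3": None,            # unmapped
    "read4": ("chr2", 150),   # exactly 50 bp off: correct under the inclusive rule
}

precision, recall = location_precision_recall(truth, predicted)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three mapped reads are correct, so precision is 2/3, while recall is 2/4 because the unmapped read still counts against it. Scoring the correctness of aligned bases, the second metric described above, would need per-base comparison of the alignments and is not covered by this sketch.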

