Fast and sensitive alignment of microbial whole genome sequencing reads to large sequence datasets on a desktop PC: application to metagenomic datasets and pathogen identification.

Pongor LS, Vera R, Ligeti B - PLoS ONE (2014)

Bottom Line: We found that Taxoner performs as well as, and often better than, BLAST but requires two orders of magnitude less running time, so it can be run on desktop or laptop computers. Taxoner is slower than approaches that use small marker databases but is more sensitive owing to its comprehensive reference database. Taxoner is written in C for Linux operating systems.


Affiliation: Faculty of Information Technology, Pázmány Péter Catholic University, Budapest, Hungary; 2nd Department of Pediatrics, Semmelweis University, Budapest, Hungary.

ABSTRACT
Next-generation sequencing (NGS) of metagenomic samples is becoming a standard approach for detecting individual species or pathogenic strains of microorganisms. Computer programs used in the NGS community have to balance speed against sensitivity; as a result, species- or strain-level identification is often inaccurate, and low-abundance pathogens can sometimes be missed. We have developed Taxoner, an open-source taxon assignment pipeline that includes a fast aligner (e.g. Bowtie2) and a comprehensive DNA sequence database. We tested the program on simulated datasets as well as experimental data from the Illumina, IonTorrent, and Roche 454 sequencing platforms. We found that Taxoner performs as well as, and often better than, BLAST but requires two orders of magnitude less running time, so it can be run on desktop or laptop computers. Taxoner is slower than approaches that use small marker databases but is more sensitive owing to its comprehensive reference database. In addition, it can easily be tuned to specific applications using small tailored databases. When applied to metagenomic datasets, Taxoner can provide a functional summary of the mapped genes as well as strain-level identification. Taxoner is written in C for Linux operating systems. The code and documentation are available for research applications at http://code.google.com/p/taxoner.


Figure 1 (pone-0103441-g001): The Taxoner principle. Reads are mapped to genomes and the corresponding taxon names are read from an ontology, in this case a taxonomic tree. For function analysis, the name of the mapped gene is read from an ontology of function names such as GO.
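
The lookup described in the caption can be illustrated with a minimal Python sketch. It is not Taxoner's own code (Taxoner is written in C): the toy taxonomy, the genome identifiers, and the function names are all hypothetical. Resolving a read that hits several genomes to their lowest common ancestor is an assumption made here for illustration; the text above does not say how such reads are handled.

```python
# Toy taxonomy: taxon id -> (parent id, name). Ids and names are illustrative only.
TAXONOMY = {
    1: (1, "root"),          # the root is its own parent
    2: (1, "Bacteria"),
    10: (2, "Genus A"),
    11: (10, "Species A1"),
    12: (10, "Species A2"),
    20: (2, "Genus B"),
    21: (20, "Species B1"),
}

# Hypothetical mapping from reference genome identifiers to taxon ids.
GENOME_TO_TAXON = {"genome_a1": 11, "genome_a2": 12, "genome_b1": 21}


def lineage(taxon_id):
    """Return the taxon ids from the given node up to the root."""
    path = [taxon_id]
    while TAXONOMY[path[-1]][0] != path[-1]:
        path.append(TAXONOMY[path[-1]][0])
    return path


def lowest_common_ancestor(taxon_ids):
    """Return the deepest taxon shared by the lineages of all given taxa."""
    ids = list(taxon_ids)
    common = set(lineage(ids[0]))
    for taxon_id in ids[1:]:
        common &= set(lineage(taxon_id))
    # Walk upward from the first taxon; the first shared node is the deepest one.
    for node in lineage(ids[0]):
        if node in common:
            return node


def assign_taxon(genome_hits):
    """Map the genomes a read aligned to onto a single taxon name (assumed LCA rule)."""
    taxa = {GENOME_TO_TAXON[genome] for genome in genome_hits}
    if not taxa:
        return None  # the read did not align to any genome
    return TAXONOMY[lowest_common_ancestor(taxa)][1]
```

In this sketch a read aligned only to `genome_a1` is labeled with that genome's species, while a read aligned to both `genome_a1` and `genome_a2` is reported at the genus they share.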

Mentions: A typical output is shown in Figure 1, with the reads indexed according to the NCBI taxonomy. Reads are mapped to genes by the Taxoner gene assignment module (see Methods), which uses a pre-built dataset. This step typically takes more time than the alignment, and its time requirements are not included in Table 1. Typical results are shown in Figure 2 (A: list of functions; B: bar diagram).
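
One way to picture the gene assignment step is as an interval lookup of each alignment position against a pre-built, sorted gene table. The Python sketch below rests on that assumption and does not reproduce the Taxoner module; the table layout, gene names, coordinates, and function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical pre-built gene table: genome id -> list of (start, end, gene name),
# sorted by start, with 1-based inclusive, non-overlapping coordinates.
GENES = {
    "genome_a1": [(100, 400, "geneX"), (500, 900, "geneY"), (1200, 1500, "geneZ")],
}

# Start coordinates per genome, extracted once so each lookup is a binary search.
STARTS = {genome: [start for start, _, _ in rows] for genome, rows in GENES.items()}


def gene_at(genome_id, position):
    """Return the gene covering an alignment position, or None if it is intergenic."""
    rows = GENES.get(genome_id)
    if not rows:
        return None
    # Index of the last gene whose start is <= position (-1 if there is none).
    i = bisect_right(STARTS[genome_id], position) - 1
    if i >= 0 and rows[i][0] <= position <= rows[i][1]:
        return rows[i][2]
    return None


def gene_summary(alignments):
    """Count reads per gene from (genome id, position) pairs, skipping intergenic hits."""
    genes = (gene_at(genome_id, position) for genome_id, position in alignments)
    return Counter(gene for gene in genes if gene is not None)
```

A per-gene count of this kind could then be joined to a table of function names (for example GO terms) to produce a list of functions like the one described for Figure 2.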

