AGeS: a software system for microbial genome sequence annotation.

Kumar K, Desai V, Cheng L, Khitrov M, Grover D, Satya RV, Yu C, Zavaljevski N, Reifman J - PLoS ONE (2011)

Bottom Line: The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.

Affiliation: DoD Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Ft. Detrick, Maryland, United States of America.

ABSTRACT

Background: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance.

Methodology: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.

pone-0017469-g002: Schematic representation of the various tools of the genome annotation pipeline. Given assembled contigs in a FASTA format file, processing starts with the Do-It-Yourself Annotation (DIYA) genome annotation tool, followed by post-processing, tandem repeat annotation, and protein function prediction with Pipeline for Protein Annotation (PIPA).

Mentions: As shown in Figure 2, the annotation pipeline takes as input assembled contiguous sequences, or contigs, in FASTA format files generated by high-throughput sequencing technologies [25]–[27]. AGeS uses the DIYA framework [2] to analyze input contigs. Contigs are first concatenated to create a continuous sequence, or pseudo-assembly, in which the space between adjacent contigs is filled with an 18-bp spacer containing translational stop codons in all six reading frames.
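
The concatenation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DIYA or AGeS implementation: the `SPACER` sequence is a hypothetical 18-bp example chosen only because it carries stop codons in all six frames, and the function names (`has_six_frame_stops`, `build_pseudo_assembly`) are invented for this sketch. Input sequences are assumed to be uppercase A/C/G/T strings.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical 18-bp spacer for illustration only; the source does not give
# the actual spacer sequence used by DIYA/AGeS.
SPACER = "TAACTAGCTAGCTAGTTA"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def frames_with_stop(seq):
    """Return the forward frame offsets (0, 1, 2) that contain a stop codon."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in STOP_CODONS}


def has_six_frame_stops(seq):
    """True if seq has a stop codon in all three frames on both strands."""
    all_frames = {0, 1, 2}
    return (frames_with_stop(seq) == all_frames
            and frames_with_stop(reverse_complement(seq)) == all_frames)


def build_pseudo_assembly(contigs, spacer=SPACER):
    """Join contigs with the spacer between adjacent contigs.

    Returns the concatenated sequence and, for each contig, its
    0-based half-open (start, end) coordinates in that sequence.
    """
    if not has_six_frame_stops(spacer):
        raise ValueError("spacer lacks a stop codon in at least one frame")
    pieces, coords, offset = [], [], 0
    for index, contig in enumerate(contigs):
        if index > 0:
            pieces.append(spacer)
            offset += len(spacer)
        pieces.append(contig)
        coords.append((offset, offset + len(contig)))
        offset += len(contig)
    return "".join(pieces), coords
```

Because the spacer stops translation in every frame on both strands, an open reading frame predicted on the pseudo-assembly cannot run from one contig into the next. The returned coordinates allow predicted features to be mapped back to their original contigs.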

