Improving pan-genome annotation using whole genome multiple alignment.

Angiuoli SV, Dunning Hotopp JC, Salzberg SL, Tettelin H - BMC Bioinformatics (2011)

Bottom Line: Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency.

Affiliation: Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA. angiuoli@umiacs.umd.edu

Figure 6: Annotation anomalies caused by a single genome. Each row provides a count of ortholog groups where the named genome is inconsistent with the remaining genomes in the group. In these cases, the annotated translation initiation site in the named genome in Nmen verB did not match any of the other annotated gene structures in the ortholog groups.

Mentions: As a case study, we evaluated the Mugsy-Annotator report for the dataset of 20 Neisseria meningitidis (Nmen) genomes. Inconsistent translation initiation sites (TIS) are the most commonly detected anomaly in Nmen, with 30% of aligned gene sets containing more than one annotated TIS. Because TIS prediction lacks precision, we expect the number of TIS inconsistencies to increase as the number of genomes increases, especially since our method marks a group as inconsistent even if the annotation error is limited to a single genome. To show how overall consistency is affected by any single genome, Mugsy-Annotator reports the number of times a single genome is inconsistent with the rest of the set. An examination of the Nmen genomes shows that certain subsets of genomes have better internal consistency. In 27% of groups with TIS inconsistencies, an alternative annotation in a single genome will resolve the inconsistencies for the group (Figure 6). Although some of the Nmen genomes contributed to more annotation inconsistencies than others, all of the genomes contributed to inconsistencies in at least one group.
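
The per-genome count described above can be illustrated with a small sketch. This is not Mugsy-Annotator's code or data model: the function names, the genome labels (`gA` to `gD`), and the representation of each ortholog group as a mapping from genome to a TIS position in shared alignment coordinates are all assumptions made for illustration. A group is treated as resolvable by a single genome when exactly one genome carries a TIS that differs from a value shared by all the others.

```python
from collections import Counter

def single_genome_outlier(tis_by_genome):
    """Return the one genome whose TIS disagrees with all others, else None.

    tis_by_genome: hypothetical mapping of genome name -> annotated TIS
    position, expressed in shared alignment coordinates (an assumption).
    """
    counts = Counter(tis_by_genome.values())
    if len(counts) != 2:
        return None  # either consistent, or more than two competing TIS
    lone_values = [tis for tis, n in counts.items() if n == 1]
    if len(lone_values) != 1:
        return None  # no lone value, or a two-genome group (ambiguous)
    lone = lone_values[0]
    return next(g for g, tis in tis_by_genome.items() if tis == lone)

def summarize(groups):
    """Count inconsistent groups and, per genome, the groups it alone breaks."""
    inconsistent = [g for g in groups if len(set(g.values())) > 1]
    per_genome = Counter()
    for group in inconsistent:
        outlier = single_genome_outlier(group)
        if outlier is not None:
            per_genome[outlier] += 1
    return len(inconsistent), per_genome

# Made-up example groups; the positions are illustrative, not real data.
groups = [
    {"gA": 100, "gB": 100, "gC": 100, "gD": 100},  # consistent
    {"gA": 250, "gB": 250, "gC": 262, "gD": 250},  # gC alone disagrees
    {"gA": 40, "gB": 52, "gC": 40, "gD": 52},      # split, no single outlier
    {"gA": 7, "gB": 7, "gC": 7, "gD": 19},         # gD alone disagrees
]
n_inconsistent, per_genome = summarize(groups)
```

In this toy input, three of the four groups are inconsistent and two of those three would be resolved by changing the annotation in one genome, which is the kind of ratio the 27% figure reports for the Nmen dataset.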

