Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae.

Meng S, Brown DE, Ebbole DJ, Torto-Alalibo T, Oh YY, Deng J, Mitchell TK, Dean RA - BMC Microbiol. (2009)

Bottom Line: Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins were assigned to the 3 root terms. Our analysis provides comprehensive and robust GO annotations of the M. oryzae genome assemblies that will be solid foundations for further functional interrogation of M. oryzae.


Affiliation: Center for Integrated Fungal Research, North Carolina State University, Raleigh NC 27695, USA. mengs@med.unc.edu

ABSTRACT

Background: Magnaporthe oryzae is the causal agent of rice blast, the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced, and its automated annotation has recently been updated to Version 6 (http://www.broad.mit.edu/annotation/genome/magnaporthe_grisea/MultiDownloads.html). However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using a standardized vocabulary. We report an overview of the GO annotation for Version 5 of the M. oryzae genome assembly.

Methods: A similarity-based (i.e., computational) GO annotation with manual review was conducted and then integrated with a literature-based GO annotation performed with computational assistance. For the similarity-based GO annotation, a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins, that is, proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, the biological appropriateness of the functional assignments was manually checked.
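The reciprocal best hits idea named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the hit tuples, the use of bit score to rank hits, and all protein identifiers below are assumptions made for illustration only. A pair is kept when each protein is the other's top-scoring hit in the opposite search direction.

```python
from typing import Dict, Iterable, List, Tuple

# One BLASTP hit as (query_id, subject_id, bit_score).
# This tuple layout is an assumption for illustration, not the authors' format.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first seen wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)   # e.g. M. oryzae proteins searched against GO proteins
    rev = best_hits(reverse)   # e.g. GO proteins searched against M. oryzae proteins
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical identifiers and scores, for demonstration only.
forward = [
    ("MGG_0001", "P12345", 510.0),
    ("MGG_0001", "Q99999", 120.0),
    ("MGG_0002", "Q99999", 300.0),
    ("MGG_0003", "P12345", 200.0),
]
reverse = [
    ("P12345", "MGG_0001", 505.0),
    ("P12345", "MGG_0003", 190.0),
    ("Q99999", "MGG_0002", 298.0),
]

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In this toy input, MGG_0003 is dropped because its best hit, P12345, pairs more strongly with MGG_0001 in the reverse direction. The manual review and cross-validation steps described in the Methods are outside the scope of this sketch.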

Results: In total, 6,286 proteins received GO term assignments via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for the Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experimentally determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins were assigned to the 3 root terms. The Version 5 GO annotation is publicly queryable via the GO site (http://amigo.geneontology.org/cgi-bin/amigo/go.cgi). Additionally, the genome of M. oryzae is constantly being refined and updated as new information is incorporated. For the latest GO annotation of the Version 6 genome, please visit our website (http://scotland.fgl.ncsu.edu/smeng/GoAnnotationMagnaporthegrisea.html). The preliminary GO annotation of the Version 6 genome is stored in a local MySQL database that is publicly queryable through a user-friendly interface, the Adhoc Query System.
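The integration step reported in the Results (merging the two annotation sets and assigning the root terms to proteins left unannotated) might look roughly like the following Python sketch. The dictionary layout, function name, and protein identifiers are hypothetical; the three root identifiers are the standard GO roots for biological_process, molecular_function, and cellular_component.

```python
from typing import Dict, Iterable, Set

# The three GO root terms: biological_process, molecular_function,
# cellular_component.
ROOT_TERMS: Set[str] = {"GO:0008150", "GO:0003674", "GO:0005575"}


def integrate(similarity: Dict[str, Set[str]],
              literature: Dict[str, Set[str]],
              all_proteins: Iterable[str]) -> Dict[str, Set[str]]:
    """Union the two annotation sets; give root terms to unannotated proteins."""
    merged: Dict[str, Set[str]] = {}
    for source in (similarity, literature):
        for protein, terms in source.items():
            merged.setdefault(protein, set()).update(terms)
    for protein in all_proteins:
        if not merged.get(protein):
            # No specific term from either source: fall back to the roots.
            merged[protein] = set(ROOT_TERMS)
    return merged


# Hypothetical proteins and term assignments, for demonstration only.
similarity = {"MGG_0001": {"GO:0016301"}}
literature = {"MGG_0001": {"GO:0006468"}, "MGG_0002": {"GO:0005576"}}
proteome = ["MGG_0001", "MGG_0002", "MGG_0003"]

merged = integrate(similarity, literature, proteome)
annotated = sum(1 for terms in merged.values() if terms != ROOT_TERMS)
print(annotated, "of", len(merged), "proteins carry specific GO terms")
```

Counting proteins whose term set differs from the roots gives the kind of "annotated with specific GO terms" tally quoted above. The real integration also involved manual review, which this sketch does not model.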

Conclusion: Our analysis provides comprehensive and robust GO annotations of the M. oryzae genome assemblies that will be solid foundations for further functional interrogation of M. oryzae.



Figure 1: Features of reciprocal best BLASTP matches between GO-annotated proteins and predicted proteins of Magnaporthe oryzae. The vast majority of the matches to characterized proteins have high sequence identity over much of their length. Shaded grey bars indicate matches with a percentage of identity (pid) ≥ 40%, and shaded black bars indicate pid < 40%.

Mentions: From the initial BLASTP analysis for reciprocal best hits, 6,286 (49%) of the 12,832 predicted proteins were annotated with 1,911 distinct and specific GO terms out of a total of 29,126 assigned terms. In total, 4,881 (78%) of the 6,286 proteins were considered significant matches to characterized GO proteins, with an E-value < 10⁻²⁰ and a percentage of identity (pid) ≥ 40%. Furthermore, 4,535 (93%) of the 4,881 proteins were annotated based on highly significant similarities, with E-values of 0 and pid ≥ 40% (see Figure 1 for details). The pairwise alignments of these significant matches were manually reviewed. Additionally, these high-quality matches were cross-validated as follows:

