PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor.

Heazlewood JL, Durek P, Hummel J, Selbig J, Weckwerth W, Walther D, Schulze WX - Nucleic Acids Res. (2007)

Bottom Line: The database is searchable by protein accession number, physical peptide characteristics, and experimental conditions (tissue sampled, phosphopeptide enrichment method). An analysis of the current annotated Arabidopsis proteome yielded 27,782 predicted phosphoserine sites distributed across 17,035 proteins. These prediction results are summarized graphically in the database together with the experimental phosphorylation sites in a whole-sequence context.


Affiliation: ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Crawley 6009, WA, Australia.

ABSTRACT
The PhosPhAt database consolidates current knowledge of Arabidopsis phosphorylation sites identified by mass spectrometry and combines it with a phosphorylation site predictor trained specifically on experimentally identified Arabidopsis phosphorylation motifs. The database currently contains 1187 unique tryptic peptide sequences encompassing 1053 Arabidopsis proteins. Among the characterized phosphorylation sites, there are over 1000 with unambiguous site assignments and nearly 500 for which the precise phosphorylation site could not be determined. The database is searchable by protein accession number, physical peptide characteristics, and experimental conditions (tissue sampled, phosphopeptide enrichment method). For each protein, a phosphorylation site overview is presented in tabular form with detailed information on each identified phosphopeptide. We used a set of 802 experimentally validated serine phosphorylation sites to develop a method for predicting serine phosphorylation (pSer) in Arabidopsis. An analysis of the current annotated Arabidopsis proteome yielded 27,782 predicted phosphoserine sites distributed across 17,035 proteins. These prediction results are summarized graphically in the database together with the experimental phosphorylation sites in a whole-sequence context. The Arabidopsis Protein Phosphorylation Site Database (PhosPhAt) provides a valuable resource to the plant science community and can be accessed at http://phosphat.mpimp-golm.mpg.de.

Figure 5: Negative log(P-values) from the Fisher exact test on the occurrences of GO function terms associated with predicted phosphoproteins. P-values were corrected for multiple testing using the False Discovery Rate (FDR) formalism (25). Overrepresented GO terms are colored red, underrepresented terms blue. GO terms were included if pFDR < 0.001. GO annotations were taken from TAIR (19). To avoid training bias, phosphorylation sites used to train the classifier were excluded from the Fisher exact test. Only GO assignments with the following evidence categories were considered: direct assay, mutant phenotype, physical and genetic interaction, and sequence or structural similarity.

Mentions: To test for over- and under-representation of predicted phosphorylation sites in different functional categories based on GO annotations (23), we applied the Fisher exact test to the prediction results classified by GO term. Proteins involved in regulatory and signaling processes are significantly overrepresented among the proteins predicted with high confidence to be phosphorylated, while housekeeping and other enzymatic functions are underrepresented (Figure 5).
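
The enrichment test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the GO term names and all counts are hypothetical, the function names are invented for this example, and the Benjamini-Hochberg adjustment is used here as a simple stand-in for the FDR formalism cited as (25), whose exact procedure is not given in the text. Each GO term gets a 2x2 table (predicted phosphoproteins with and without the term, versus all other proteins with and without the term), a two-sided Fisher exact P-value, and an FDR-adjusted value.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts per GO term:
# (phospho with term, phospho without, other with term, other without)
tables = {
    "protein kinase activity": (120, 880, 150, 8850),
    "structural constituent of ribosome": (5, 995, 300, 8700),
    "transporter activity": (60, 940, 540, 8460),
}

names = list(tables)
pvals = [fisher_exact_two_sided(*tables[n]) for n in names]
qvals = benjamini_hochberg(pvals)

for name, p, q in zip(names, pvals, qvals):
    a, b, c, d = tables[name]
    # Compare the fraction carrying the term in each group to get a direction.
    direction = "over" if a / (a + b) > c / (c + d) else "under"
    score = -log10(p) if p > 0 else float("inf")
    print(f"{name}: -log10(P)={score:.2f}, adjusted={q:.3g}, {direction}represented")
```

A term would then be reported in a figure like Figure 5 only if its adjusted value falls below the chosen threshold (0.001 in the caption), with the direction deciding the color.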

