Prioritizing functional phosphorylation sites based on multiple feature integration.

Xiao Q, Miao B, Bi J, Wang Z, Li Y - Sci Rep (2016)

Bottom Line: In recent years, many phosphorylation sites have been identified as a result of advances in mass-spectrometric techniques. We found significant differences in the distribution of evolutionary conservation, kinase association, disorder score, and secondary structure between known functional and background phosphorylation datasets. We built four different types of classifiers based on the most representative features and found that their performances were similar.

Affiliation: Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, P. R. China.

ABSTRACT
Protein phosphorylation is an important type of post-translational modification that is involved in a variety of biological activities. Most phosphorylation events occur on serine, threonine and tyrosine residues in eukaryotes. In recent years, many phosphorylation sites have been identified as a result of advances in mass-spectrometric techniques. However, a large percentage of phosphorylation sites may be non-functional. Systematically prioritizing functional sites from a large number of phosphorylation sites will be increasingly important for the study of their biological roles. This study focused on exploring the intrinsic features of functional phosphorylation sites to predict whether a phosphosite is likely to be functional. We found significant differences in the distribution of evolutionary conservation, kinase association, disorder score, and secondary structure between known functional and background phosphorylation datasets. We built four different types of classifiers based on the most representative features and found that their performances were similar. We also prioritized 213,837 human phosphorylation sites from a variety of phosphorylation databases, which will be helpful for subsequent functional studies. All predicted results are available for query and download on our website (Predict Functional Phosphosites, PFP, http://pfp.biosino.org/).

Figure 5: Evaluation of the prediction results of the different models. (A) The upper half of the figure shows the relationship between the prediction-score threshold and the false-positive rate (FPR). The lower half shows the relationship between the proportion of predicted positive sites and the FPR. A larger prediction-score threshold is generally associated with both a lower FPR and a smaller proportion of predicted positive sites. (B) Venn diagram of the prediction results of the different models (FPR = 0.1) for all 213,837 human phosphosites. The results of the different models agree well.

Mentions: We applied all four models to the whole human phosphorylation dataset collected from the PhosphoSitePlus, Phospho.ELM, and dbPTM databases [42, 43]. For each site, the model yields a prediction score ranging from 0 to 1 that indicates how strongly the site is predicted to be positive. Because different thresholds on the prediction score give different proportions of predicted positive and negative sites, we show, for each model, how the false-positive rate and the proportion of predicted positive sites vary with the threshold (Fig. 5A).
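The relationship between score threshold, false-positive rate, and the proportion of predicted positive sites can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of NumPy, the convention that a site is called positive when its score is strictly above the threshold, and the example scores are all assumptions made for illustration.

```python
import numpy as np


def fpr_at_threshold(negative_scores, threshold):
    """Fraction of background (negative) sites scored strictly above the threshold."""
    negative_scores = np.asarray(negative_scores, dtype=float)
    return float(np.mean(negative_scores > threshold))


def positive_proportion(all_scores, threshold):
    """Fraction of all scored sites that would be called positive at this threshold."""
    all_scores = np.asarray(all_scores, dtype=float)
    return float(np.mean(all_scores > threshold))


def threshold_for_fpr(negative_scores, target_fpr):
    """Smallest observed negative score whose FPR does not exceed target_fpr.

    Scanning the sorted unique negative scores always terminates with a value,
    because no negative site scores above the maximum negative score (FPR = 0).
    """
    negative_scores = np.asarray(negative_scores, dtype=float)
    for candidate in np.unique(negative_scores):  # np.unique returns ascending order
        if np.mean(negative_scores > candidate) <= target_fpr:
            return float(candidate)
    raise ValueError("negative_scores must not be empty")


# Hypothetical scores, invented for illustration only (not data from the paper).
background_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
site_scores = [0.05, 0.5, 0.92, 0.99]

cutoff = threshold_for_fpr(background_scores, target_fpr=0.1)
proportion = positive_proportion(site_scores, cutoff)
```

Raising the cutoff can only remove sites from the predicted positive set, so both the FPR and the predicted positive proportion are non-increasing in the threshold, which is the trend the figure describes.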

