Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites.

Lin S, Song Q, Tao H, Wang W, Wan W, Huang J, Xu C, Chebii V, Kitony J, Que S, Harrison A, He H - Sci Rep (2015)

Bottom Line: Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence which did not appear in the training dataset.


Affiliation: College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.

ABSTRACT
Experimentally determined or computationally predicted protein phosphorylation sites for distinct species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice_Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice_phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice_Phospho 1.0 to be 82.0% and 0.64, respectively, significantly higher than the corresponding measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice_Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
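
For readers unfamiliar with the two measures quoted in the abstract, the following is a minimal sketch of how ACC and MCC are conventionally computed from a binary confusion matrix. The function name `acc_and_mcc` and the counts in the usage lines are hypothetical and are not taken from the paper; they only illustrate the standard definitions.

```python
import math


def acc_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) for a binary classifier.

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    # MCC denominator: geometric mean of the four marginal totals.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


# Hypothetical, made-up counts for a balanced test set of 200 sites.
# These are NOT the paper's confusion matrix.
acc, mcc = acc_and_mcc(tp=82, tn=82, fp=18, fn=18)
print(f"ACC = {acc:.2f}, MCC = {mcc:.2f}")
```

Unlike ACC, MCC uses all four cells of the confusion matrix, so it stays informative when phosphorylated and non-phosphorylated sites are unevenly represented.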



© Copyright Policy - open-access

Figure 2: ROC curves showing the predictive performance of the SVM with combined encoding schemes. A. ROC curves of SVM with AF-CKSAAP, AF and CKSAAP. B. ROC curves of SVM with AF-KNN, AF and KNN. C. ROC curves of SVM with CKSAAP-KNN, KNN and CKSAAP.

Mentions: AF combined with CKSAAP (AF-CKSAAP) performed better than either encoding scheme alone, AF or CKSAAP (Fig. 2A). The same was true for AF combined with KNN (AF-KNN) (Fig. 2B). However, CKSAAP combined with KNN (CKSAAP-KNN) outperformed KNN but did not outperform CKSAAP (Fig. 2C). In a preliminary experiment, we found that combining all three encoding schemes did not significantly outperform CKSAAP (data not shown) but did increase the feature dimension. This result implies that AF, KNN and CKSAAP might be complementary to each other to some extent, especially AF and CKSAAP.
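
To make the combined encoding concrete, here is a minimal sketch of how an AF-CKSAAP feature vector could be built for one sequence window by concatenating the two encodings. The window length (15 residues), the maximum gap `k_max=3`, the normalisation by the number of valid pairs, and all function names are assumptions made for illustration; this excerpt does not state the settings used in Rice_Phospho 1.0, and the example window is invented.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so features line up.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def af_features(window):
    """Amino acid occurrence Frequency: 20 values, one per residue type."""
    residues = [aa for aa in window if aa in AMINO_ACIDS]
    n = len(residues) or 1  # avoid division by zero on an empty window
    return [residues.count(aa) / n for aa in AMINO_ACIDS]


def cksaap_features(window, k_max=3):
    """Composition of K-Spaced Amino Acid Pairs for k = 0..k_max.

    For each k, count residue pairs separated by exactly k positions and
    normalise by the number of valid pairs (assumed normalisation).
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skips padding or non-standard symbols
                counts[pair] += 1
                n_pairs += 1
        total = n_pairs or 1
        features.extend(counts[p] / total for p in PAIRS)
    return features


def af_cksaap_features(window, k_max=3):
    """Concatenate AF (20 values) with CKSAAP (400 values per k)."""
    return af_features(window) + cksaap_features(window, k_max)


# Invented 15-residue window centred on a candidate serine (index 7).
window = "MKRLAPGSPSDEEVK"
vec = af_cksaap_features(window)
print(len(vec))
```

The resulting vector is what would be passed to an SVM. Concatenation also shows why adding a third encoding raises the feature dimension even when it adds little predictive power, as noted above.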

