Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites.

Lin S, Song Q, Tao H, Wang W, Wan W, Huang J, Xu C, Chebii V, Kitony J, Que S, Harrison A, He H - Sci Rep (2015)

Bottom Line: Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence which did not appear in the training dataset.


Affiliation: College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.

ABSTRACT
Experimentally-determined or computationally-predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phosphorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation site predictor, Rice_Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice_phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice_Phospho 1.0 to be 82.0% and 0.64, respectively, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice_Phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.
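
To make the AF-CKSAAP idea concrete, the following is a minimal sketch of how such a feature vector could be built for a sequence window around a candidate site. It is not the authors' implementation: the function names, the 20-letter alphabet, the maximum gap `k_max=3` and the example window are all assumptions chosen for illustration, since the abstract does not state the window size or the range of k used.

```python
from itertools import product

# Assumption: the 20 standard amino acids; any other character (e.g. padding)
# is ignored when counting residues or pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def af_features(window):
    """Amino acid occurrence Frequency: fraction of the window taken by each residue."""
    n = len(window)
    if n == 0:
        raise ValueError("window must not be empty")
    return [window.count(a) / n for a in AMINO_ACIDS]


def cksaap_features(window, k_max=3):
    """Composition of k-spaced amino acid pairs for k = 0..k_max (hypothetical k_max)."""
    feats = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        total = len(window) - k - 1  # number of residue pairs separated by k positions
        for i in range(max(total, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        # Normalise counts to frequencies; an over-short window yields zeros.
        feats.extend(counts[p] / total if total > 0 else 0.0 for p in PAIRS)
    return feats


def af_cksaap_features(window, k_max=3):
    """Concatenate AF and CKSAAP features into one vector."""
    return af_features(window) + cksaap_features(window, k_max)


if __name__ == "__main__":
    vec = af_cksaap_features("AASAA", k_max=3)
    print(len(vec))  # 20 AF values plus 400 pair frequencies per value of k
```

A vector of this kind would then be passed, together with a positive or negative site label, to an SVM or any other classifier under comparison.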



Figure 1: ROC curves of the predictive performance of SVM with 3 different sole encoding schemes. *In the diagrams, a larger area under the ROC curve indicates better classification performance. The same applies to the figures below.

Mentions: The performance of the three sole encoding schemes was measured on datasets of different sizes, with SVM used as the classifier. CKSAAP performed best among the three sole encoding schemes (Fig. 1). However, as the size of the dataset increased, the performance of SVM with CKSAAP decreased, that of SVM with AF fluctuated, and that of SVM with KNN increased (Table 1). The same trends (CKSAAP decreasing, AF fluctuating and KNN increasing) were also observed when the ratio of (+) sites to (−) sites increased (Table 1).
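
Since the comparison above rests on the area under the ROC curve, a short sketch of how that area can be computed from classifier scores may help. It uses the standard rank interpretation of the AUC (the probability that a randomly chosen positive site scores higher than a randomly chosen negative one, with ties counted as one half). The function name and the example scores are illustrative assumptions, not data from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: classifier decision values (higher means more likely phosphorylated).
    labels: 1 for (+) sites, 0 for (-) sites.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores: three of the four (+, -) pairs are ranked correctly.
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of sites, which is acceptable for a sketch; a sort-based rank computation gives the same value for larger datasets.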

