Computational Identification of Protein Pupylation Sites by Using Profile-Based Composition of k-Spaced Amino Acid Pairs.

Hasan MM, Zhou Y, Lu X, Li J, Song J, Zhang Z - PLoS ONE (2015)

Bottom Line: Then, a Support Vector Machine (SVM) classifier is trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites.


Affiliation: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, 100193, China.

ABSTRACT
Prokaryotic proteins are regulated by pupylation, a type of post-translational modification that contributes to cellular function in bacterial organisms. In the pupylation process, tagging by the prokaryotic ubiquitin-like protein (Pup) is functionally analogous to ubiquitination: it marks target proteins for proteasomal degradation. To date, several experimental methods have been developed to identify pupylated proteins and their pupylation sites, but these methods are generally laborious and costly. Therefore, computational methods that can accurately predict potential pupylation sites from protein sequence information are highly desirable. In this paper, a novel predictor termed pbPUP has been developed for accurate prediction of pupylation sites. In particular, a sophisticated sequence encoding scheme [i.e. the profile-based composition of k-spaced amino acid pairs (pbCKSAAP)] is used to represent the sequence patterns and evolutionary information of the sequence fragments surrounding pupylation sites. Then, a Support Vector Machine (SVM) classifier is trained using the pbCKSAAP encoding scheme. The final pbPUP predictor achieves an AUC value of 0.849 in 10-fold cross-validation tests and outperforms other existing predictors on a comprehensive independent test dataset. The proposed method is anticipated to be a helpful computational resource for the prediction of pupylation sites. The web server and curated datasets in this study are freely available at http://protein.cau.edu.cn/pbPUP/.
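
For orientation, the sketch below illustrates the plain, sequence-based composition of k-spaced amino acid pairs (CKSAAP) that pbCKSAAP builds on. It is not the profile-based variant, and the abstract does not give its formula. The 20-letter alphabet, the range of k, the normalisation by the number of pair positions, and the feature names such as "Ax1A" are all assumptions made for illustration.

```python
from itertools import product
from typing import Dict

# Assumption: the 20 standard amino acids; other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(fragment: str, k_max: int = 2) -> Dict[str, float]:
    """Return k-spaced amino acid pair frequencies for one sequence fragment.

    For each gap k in 0..k_max, count every residue pair separated by
    exactly k positions and divide by the number of pair positions.
    Feature names look like "Ax1K" (A, one residue in between, K);
    this naming is hypothetical.
    """
    features: Dict[str, float] = {}
    for k in range(k_max + 1):
        n_pairs = len(fragment) - k - 1  # pair positions available for this gap
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        for i in range(max(n_pairs, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            if pair in counts:  # skip pairs with non-standard symbols
                counts[pair] += 1
        for pair, count in counts.items():
            name = f"{pair[0]}x{k}{pair[1]}"
            features[name] = count / n_pairs if n_pairs > 0 else 0.0
    return features


# Toy fragment, not taken from the paper's dataset.
feats = cksaap("AKAKA", k_max=1)
print(len(feats), feats["Ax0K"], feats["Ax1A"])
```

Each gap value contributes 400 pair features, so the toy call with k_max=1 yields 800 features in total. A profile-based version would presumably replace the residue counts with values derived from a sequence profile, but that step is not described in this excerpt and is not attempted here.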



pone.0129635.g004 (Fig 4): Comparison of the selected features in pbCKSAAP and CKSAAP using the χ² feature selection method. (A) Feature scores of pbCKSAAP and CKSAAP; (B) the numbers of selected features in pbCKSAAP and CKSAAP at the same feature selection score cutoff, χ² ≥ 3.

Mentions: To further compare pbCKSAAP with CKSAAP, the χ² feature selection method was applied to select the most important pbCKSAAP and CKSAAP features. In particular, we found that the average χ² feature score of pbCKSAAP features was much higher than that of CKSAAP features (Fig 4A). This suggests that the pbCKSAAP features contained more important information than the CKSAAP features. To make a stringent comparison, we used the same feature score cutoff (i.e. χ² ≥ 3) to select the more informative features from both the CKSAAP and pbCKSAAP sequence encodings. When this cutoff was applied, the number of selected pbCKSAAP features was 196, while the number of selected CKSAAP features was only 169 (Fig 4B). The number of features common to both pbCKSAAP and CKSAAP was 45 (Fig 4B). In summary, we conclude that pbCKSAAP contained more informative features than CKSAAP, which provides important evidence for the better performance of pbCKSAAP.
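
The cutoff-based selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each feature is binarised (present if its value is greater than zero) and scored with the Pearson χ² statistic of the resulting 2×2 feature-by-class table. The function names and the toy data are hypothetical; only the cutoff value of 3 comes from the text.

```python
from typing import List, Sequence, Tuple


def chi2_score(feature: Sequence[float], labels: Sequence[int]) -> float:
    """Pearson chi-square of the 2x2 table (feature present?) x (class label).

    Assumption: a feature counts as "present" when its value is > 0,
    and labels are 1 (positive site) or 0 (negative site).
    """
    n = len(labels)
    a = sum(1 for x, y in zip(feature, labels) if x > 0 and y == 1)   # present, positive
    b = sum(1 for x, y in zip(feature, labels) if x > 0 and y == 0)   # present, negative
    c = sum(1 for x, y in zip(feature, labels) if x <= 0 and y == 1)  # absent, positive
    d = n - a - b - c                                                 # absent, negative
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # a constant feature or a single class carries no signal
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(
    X: Sequence[Sequence[float]], labels: Sequence[int], cutoff: float = 3.0
) -> Tuple[List[int], List[float]]:
    """Return indices of features whose chi-square score is >= cutoff, plus all scores."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], labels) for j in range(n_features)]
    selected = [j for j, s in enumerate(scores) if s >= cutoff]
    return selected, scores


# Toy data: 8 samples, 3 features (not from the paper).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
X = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
selected, scores = select_features(X, labels, cutoff=3.0)
print(selected, scores)
```

Running the same selection on two encodings and intersecting the selected feature names, for example with Python's set intersection, would give the kind of shared-feature count reported in Fig 4B, provided both encodings label their features in the same way.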

