Limits...
A multi-label classifier for predicting the subcellular localization of gram-negative bacterial proteins with both single and multiple sites.

Xiao X, Wu ZC, Chou KC - PLoS ONE (2011)

Bottom Line: Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in a same subset (subcellular location).It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc.It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development.

View Article: PubMed Central - PubMed

Affiliation: Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China. xiaoxuan0326@yahoo.com.cn

ABSTRACT
Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. In this paper, by introducing the "multi-label scale" and hybridizing the information of gene ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular localization of gram-positive bacterial proteins with both single-location and multiple-location sites. For facilitating comparison, the same stringent benchmark dataset used to estimate the accuracy of Gneg-mPLoc was adopted to demonstrate the power of iLoc-Gneg. The dataset contains 1,392 gram-negative bacterial proteins classified into the following eight locations: (1) cytoplasm, (2) extracellular, (3) fimbrium, (4) flagellum, (5) inner membrane, (6) nucleoid, (7) outer membrane, and (8) periplasm. Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in a same subset (subcellular location). It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc. As a user-friendly web-server, iLoc-Gneg is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gneg. Meanwhile, a step-by-step guide is provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gneg web-server also has the function to accept the batch job submission, which is not available in the existing version of Gneg-mPLoc web-server. It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development.

Show MeSH
A semi-screenshot to show the output of iLoc-Gneg.The input was taken from the three protein sequences listed in the Example window of the iLoc-Gneg web-server (cf. Fig. 3).
© Copyright Policy
Related In: Results  -  Collection


getmorefigures.php?uid=PMC3117797&req=5

pone-0020592-g004: A semi-screenshot to show the output of iLoc-Gneg.The input was taken from the three protein sequences listed in the Example window of the iLoc-Gneg web-server (cf. Fig. 3).

Mentions: Click on the Submit button to see the predicted result. For example, if you use the three query protein sequences in the Example window as the input, after clicking the Submit button, you will see Fig. 4 shown on your screen, indicating that the predicted result for the 1st query protein is “Cell outer membrane”, that for the 2nd one is “Cytoplasm; Periplasm”, and that for the 3rd one is “Cell inner membrane; Cytoplasm”. In other words, the 1st query protein (P0A3N8) is a single-location one residing at “cell outer membrane” only, the 2nd one (Q05097) can simultaneously reside in two different sites (“cytoplasm” and “periplasm”), and the 3rd one (P61380) can also simultaneously reside in two different sites (“cell inner membrane” and “cytoplasm”). All these results are exactly the same as observed by experiments as shown in the Supporting Information S1. It takes about 10 seconds for the above computation before the predicted results appear on your computer screen; the more number of query proteins and longer of each sequence, the more time it is usually needed.


A multi-label classifier for predicting the subcellular localization of gram-negative bacterial proteins with both single and multiple sites.

Xiao X, Wu ZC, Chou KC - PLoS ONE (2011)

A semi-screenshot to show the output of iLoc-Gneg.The input was taken from the three protein sequences listed in the Example window of the iLoc-Gneg web-server (cf. Fig. 3).
© Copyright Policy
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC3117797&req=5

pone-0020592-g004: A semi-screenshot to show the output of iLoc-Gneg.The input was taken from the three protein sequences listed in the Example window of the iLoc-Gneg web-server (cf. Fig. 3).
Mentions: Click on the Submit button to see the predicted result. For example, if you use the three query protein sequences in the Example window as the input, after clicking the Submit button, you will see Fig. 4 shown on your screen, indicating that the predicted result for the 1st query protein is “Cell outer membrane”, that for the 2nd one is “Cytoplasm; Periplasm”, and that for the 3rd one is “Cell inner membrane; Cytoplasm”. In other words, the 1st query protein (P0A3N8) is a single-location one residing at “cell outer membrane” only, the 2nd one (Q05097) can simultaneously reside in two different sites (“cytoplasm” and “periplasm”), and the 3rd one (P61380) can also simultaneously reside in two different sites (“cell inner membrane” and “cytoplasm”). All these results are exactly the same as observed by experiments as shown in the Supporting Information S1. It takes about 10 seconds for the above computation before the predicted results appear on your computer screen; the more number of query proteins and longer of each sequence, the more time it is usually needed.

Bottom Line: Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in a same subset (subcellular location).It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc.It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development.

View Article: PubMed Central - PubMed

Affiliation: Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China. xiaoxuan0326@yahoo.com.cn

ABSTRACT
Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. In this paper, by introducing the "multi-label scale" and hybridizing the information of gene ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular localization of gram-positive bacterial proteins with both single-location and multiple-location sites. For facilitating comparison, the same stringent benchmark dataset used to estimate the accuracy of Gneg-mPLoc was adopted to demonstrate the power of iLoc-Gneg. The dataset contains 1,392 gram-negative bacterial proteins classified into the following eight locations: (1) cytoplasm, (2) extracellular, (3) fimbrium, (4) flagellum, (5) inner membrane, (6) nucleoid, (7) outer membrane, and (8) periplasm. Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in a same subset (subcellular location). It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc. As a user-friendly web-server, iLoc-Gneg is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gneg. Meanwhile, a step-by-step guide is provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gneg web-server also has the function to accept the batch job submission, which is not available in the existing version of Gneg-mPLoc web-server. It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development.

Show MeSH