Better Than Nothing? Limitations of the Prediction Tool SecretomeP in the Search for Leaderless Secretory Proteins (LSPs) in Plants

ABSTRACT

In proteomic analyses of the plant secretome, the presence of putative leaderless secretory proteins (LSPs) is difficult to confirm because of possible contamination from other sub-cellular compartments. In the absence of a plant-specific tool for predicting LSPs, the mammalian-trained SecretomeP has been applied to plant proteins in multiple studies to identify the most likely LSPs. This study investigates the effectiveness of using SecretomeP on plant proteins, identifies its limitations and provides a benchmark for its use. In the absence of experimentally verified LSPs, we exploit the common-feature hypothesis behind SecretomeP and use known classically secreted proteins (CSPs) of plants as a proxy to evaluate its accuracy. We show that, contrary to the common-feature hypothesis, plant CSPs are a poor proxy for evaluating LSP detection because the SecretomeP prediction scores vary when the signal peptide (SP) is modified. Removing the SP region from CSPs and comparing the predictive performance against non-secretory proteins indicates that the commonly used threshold scores of 0.5 and 0.6 result in false-positive rates in excess of 0.3 when applied to plant proteins. Setting the false-positive rate to 0.05, consistent with the original mammalian performance of SecretomeP, yields a true-positive rate only marginally higher than the false-positive rate. Therefore, the use of SecretomeP on plant proteins is not recommended. This study investigates the trade-offs of using SecretomeP on plant proteins and provides insights into predictive features for future development of plant-specific common-feature tools.
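The threshold trade-off described in the abstract can be illustrated with a minimal sketch. It assumes that a protein is called secreted when its score is greater than or equal to the threshold, that SP-removed CSPs are the positive set, and that non-secretory proteins are the negative set. The function names and the toy scores are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def rates_at_threshold(
    pos: Sequence[float], neg: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (true-positive rate, false-positive rate) at a score threshold.

    Assumption: a score >= threshold counts as a 'secreted' call.
    """
    tpr = sum(s >= threshold for s in pos) / len(pos)
    fpr = sum(s >= threshold for s in neg) / len(neg)
    return tpr, fpr


def threshold_for_fpr(neg: Sequence[float], target_fpr: float) -> float:
    """Return the lowest observed negative score that, used as the threshold,
    keeps the false-positive rate at or below target_fpr.

    Returns infinity if no observed score satisfies the target.
    """
    for t in sorted(set(neg)):
        fpr = sum(s >= t for s in neg) / len(neg)
        if fpr <= target_fpr:
            return t
    return float("inf")


# Toy scores for illustration only (NOT data from the study).
toy_csp_scores = [0.82, 0.71, 0.64, 0.55, 0.48, 0.35]
toy_nonsecretory_scores = [0.66, 0.58, 0.52, 0.41, 0.33, 0.27, 0.21, 0.18, 0.12, 0.09]

for cutoff in (0.5, 0.6):
    tpr, fpr = rates_at_threshold(toy_csp_scores, toy_nonsecretory_scores, cutoff)
    print(f"threshold={cutoff}: TPR={tpr:.2f}, FPR={fpr:.2f}")

strict = threshold_for_fpr(toy_nonsecretory_scores, 0.1)
print("threshold for FPR <= 0.1:", strict)
print("rates at that threshold:",
      rates_at_threshold(toy_csp_scores, toy_nonsecretory_scores, strict))
```

Fixing the false-positive rate first and then reading off the true-positive rate mirrors the comparison made in the abstract: a tool is only useful at that operating point if the true-positive rate is clearly higher than the false-positive rate.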



Figure 5: Correlation scores (as described in Figure 3B) for protein sub-sets of ASURE: extracellular (A), plasma membrane (B), nucleus (C), cytosol (D), nucleus and/or cytosol (E), and plastid (F).

Mentions: Correlations for the ASURE subsets are also informative (Figure 5). The relative positive correlation value for each modification is maintained from the WallProtDB analysis, though the values obtained were mostly higher (compare Figure 5 with Figure 3B, Supplementary Table S2). The exception was the Reverse modification, which showed a small positive correlation, particularly for the nucleus, cytosol and combined sets, and a negative correlation for the WallProtDB and ASURE (Extracellular) sub-sets. This modification does not rely on a substitute SP region, and the difference in scores between positive and negative data suggests that the weak negative correlation found in WallProtDB and ASURE (Extracellular) is a feature of SP-containing proteins.

