Identification of Extracellular Segments by Mass Spectrometry Improves Topology Prediction of Transmembrane Proteins


ABSTRACT

Transmembrane proteins play a crucial role in signaling, ion transport, and nutrient uptake, as well as in maintaining the dynamic equilibrium between the internal and external environment of cells. Despite their important biological functions and abundance, less than 2% of all determined structures are transmembrane proteins. Given the persisting technical difficulties associated with high-resolution structure determination of transmembrane proteins, additional methods, including computational and experimental techniques, remain vital in promoting our understanding of their topologies, 3D structures, functions and interactions. Here we report a method for the high-throughput determination of extracellular segments of transmembrane proteins based on the identification of surface-labeled and biotin-captured peptide fragments by LC/MS/MS. We show that reliable identification of extracellular protein segments increases the accuracy and reliability of existing topology prediction algorithms. Using the experimental topology data as constraints, our improved prediction tool provides accurate and reliable topology models for hundreds of human transmembrane proteins.



Figure 3: Effect of lysine constraints on the accuracy of topology prediction. Effect of constraints on the topology prediction accuracy of CCTOP on the experimental benchmark set. (A) Prediction accuracy versus percent of extracellular lysines used as constraints in the prediction. 0, 4, 8, 12 and 16% of the used extracellular lysines were replaced by intracellular lysines (blue, green, orange, red and black line, respectively). (B) Predictions were sorted according to their reliability values, then the accuracies were calculated on proteins with the largest 1, 2, 3 … 333 reliability values, represented as coverage from 0 to 100% on the x-axis of the plot. The colors of the curves are coded according to the ratio of randomly selected extracellular lysines (blue, green, orange, red and black for 100, 75, 50, 25 and 0%, respectively). Averages are plotted with continuous lines, standard deviations are shaded.
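The accuracy-versus-coverage curve described for panel B can be sketched as follows. This is only an illustrative reading of the caption, not the authors' code: the function name `accuracy_vs_coverage` and the input layout (one `(reliability, is_correct)` pair per protein) are assumptions made for this example.

```python
def accuracy_vs_coverage(predictions):
    """Compute accuracy over the top-n most reliable predictions for n = 1..N.

    predictions: list of (reliability, is_correct) pairs, one per protein
                 (hypothetical input layout chosen for this sketch).
    Returns a list of (coverage_percent, accuracy_percent) points.
    """
    # Most reliable predictions first.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    curve = []
    correct = 0
    for n, (_, is_correct) in enumerate(ranked, start=1):
        correct += bool(is_correct)
        coverage = 100.0 * n / len(ranked)   # share of the benchmark included
        accuracy = 100.0 * correct / n       # accuracy within that share
        curve.append((coverage, accuracy))
    return curve


if __name__ == "__main__":
    toy = [(0.9, True), (0.5, False), (0.7, True), (0.2, True)]
    for coverage, accuracy in accuracy_vs_coverage(toy):
        print(f"{coverage:5.1f}% coverage: {accuracy:5.1f}% accuracy")
```

With the 333-protein benchmark, the list would hold 333 pairs and the curve would have one point per protein, matching the "largest 1, 2, 3 … 333 reliability values" in the caption.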


To measure the potential impact of our experiments on topology predictions, we analyzed a benchmark set consisting of 333 human transmembrane protein (TMP) sequences [3]. The topology of these TMPs was established based on the available 3D structures of the same or homologous proteins. The set contains 8099 extracellular and 4892 intracellular lysines. Using the CCTOP algorithm, we simulated topology predictions taking into account an increasing number of extracellular lysines as constraints [11]. To avoid any bias, all further computational and experimental constraints were neglected, and only the selected lysines were considered. We randomly selected 25%, 50% and 75% of the extracellular lysines and compared the resulting predictions to the established topology of the benchmark TMPs, using only these randomly selected extracellular lysine residues as constraints (Fig. 3A, blue line). The randomizations were repeated 50 times to calculate the average and the standard deviation of the prediction accuracy and reliability (Fig. 3B, Supplementary Figure 7). To assess the theoretical limits of our approach, a simulation was run in which all extracellular lysines were considered as constraints (100% on the plots). As expected, the accuracy of the topology predictions was significantly improved by including extracellular lysines as constraints (Fig. 3). The simulations suggest that the maximal benefit is an increase of 23 percentage points in prediction accuracy (from 56% to 79%) (Fig. 3A, blue line), which would occur with the labeling of all extracellular lysines. When the constraints are limited to 20% of the extracellular lysines (corresponding to the percent of labeled lysines in our experiments), the accuracy of the topology predictions still increases by 14 percentage points (from 56% to 70%).
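The random-selection procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the fraction is drawn per protein, and `evaluate` is a hypothetical callable standing in for running the constrained predictor and scoring it against the benchmark topology. No real CCTOP interface is used or implied.

```python
import random
import statistics


def sample_constraints(extracellular_lysines, fraction, rng):
    """Randomly pick the given fraction of extracellular lysine positions."""
    k = round(len(extracellular_lysines) * fraction)
    return sorted(rng.sample(extracellular_lysines, k))


def repeat_simulation(proteins, fraction, evaluate, n_repeats=50, seed=0):
    """Average the accuracy over repeated random draws of constraints.

    proteins: dict mapping protein id -> list of extracellular lysine positions
    evaluate: hypothetical callable taking {protein id: constraint positions}
              and returning an accuracy in [0, 1]
    Returns (mean accuracy, standard deviation) over n_repeats draws.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    accuracies = []
    for _ in range(n_repeats):
        constraints = {
            pid: sample_constraints(positions, fraction, rng)
            for pid, positions in proteins.items()
        }
        accuracies.append(evaluate(constraints))
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

Calling `repeat_simulation` once for each fraction (0.25, 0.50, 0.75, and 1.0 for the theoretical limit) would give the mean and spread per setting, which is the kind of summary plotted as lines and shaded bands in Fig. 3.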
To simulate the effect of erroneously identified positions on prediction accuracy, we corrupted the input to the prediction algorithm by replacing 4, 8, 12 and 16% of the randomly selected 25, 50, 75 and 100% extracellular lysines with intracellular lysines. As shown in Fig. 3A, false-positive constraints have a drastic effect, resulting in an actual decline in prediction accuracy.
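The corruption step can be sketched in the same spirit. The function name `corrupt_constraints` and its return layout are assumptions for illustration; the text does not specify how the replaced positions were chosen beyond the percentages, so uniform random choice is assumed here.

```python
import random


def corrupt_constraints(selected_extracellular, intracellular, error_rate, rng):
    """Swap a share of true extracellular constraints for intracellular lysines.

    selected_extracellular: positions already chosen as constraints
    intracellular: intracellular lysine positions of the same protein
    error_rate: share of constraints to replace, e.g. 0.04, 0.08, 0.12, 0.16
    Returns (kept_true_constraints, false_positive_constraints); both lists
    would be passed to the predictor as if they were extracellular.
    """
    n_bad = round(len(selected_extracellular) * error_rate)
    # Cannot introduce more false positives than intracellular lysines exist.
    n_bad = min(n_bad, len(intracellular))
    kept = rng.sample(selected_extracellular, len(selected_extracellular) - n_bad)
    false_positives = rng.sample(intracellular, n_bad)
    return sorted(kept), sorted(false_positives)


if __name__ == "__main__":
    rng = random.Random(1)
    kept, bad = corrupt_constraints(list(range(0, 50, 2)), [101, 103, 105, 107, 109], 0.16, rng)
    print(len(kept), "true constraints,", len(bad), "false positives")
```

Keeping the total number of constraints fixed while swapping in wrong ones isolates the effect of the error rate from the effect of having fewer constraints.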

