Using genome-wide measurements for computational prediction of SH2-peptide interactions.

Wunderlich Z, Mirny LA - Nucleic Acids Res. (2009)

Bottom Line: We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families, predict optimal peptides for a given SH2 domain, or study other biological interactions, e.g. protein-DNA interactions.


Affiliation: Biophysics Program, Harvard University, Cambridge, MA 02138, USA.

ABSTRACT
Peptide-recognition modules (PRMs) are used throughout biology to mediate protein-protein interactions, and many PRMs are members of large protein domain families. Recent genome-wide measurements describe networks of peptide-PRM interactions. In these networks, very similar PRMs recognize distinct sets of peptides, raising the question of how peptide-recognition specificity is achieved using similar protein domains. The analysis of individual protein complex structures often gives answers that are not easily applicable to other members of the same PRM family. Bioinformatics-based approaches, on the other hand, may be difficult to interpret physically. Here we integrate structural information with a large, quantitative data set of SH2 domain-peptide interactions to study the physical origin of domain-peptide specificity. We develop an energy model, inspired by protein folding, based on interactions between the amino-acid positions in the domain and peptide. We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families, predict optimal peptides for a given SH2 domain, or study other biological interactions, e.g. protein-DNA interactions.
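
The abstract describes an energy model built from interactions between amino-acid positions in the domain and the peptide, but this page does not give its functional form. As a minimal sketch only, one simple way to express such a model is a sum of residue-pair terms over contacting position pairs. The contact list, the potential values, and the names `interaction_energy`, `contacts`, and `potential` below are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Tuple

Contact = Tuple[int, int]                  # (domain position, peptide position)
Potential = Dict[Tuple[str, str], float]   # (domain residue, peptide residue) -> energy


def interaction_energy(domain: str, peptide: str,
                       contacts: List[Contact],
                       potential: Potential,
                       default: float = 0.0) -> float:
    """Sum residue-pair energies over the contacting position pairs.

    Residue pairs missing from the potential contribute `default`.
    """
    total = 0.0
    for i, j in contacts:
        total += potential.get((domain[i], peptide[j]), default)
    return total


# Toy inputs (assumed values, not fitted to any data).
contacts = [(0, 1), (2, 0)]
potential = {("R", "Y"): -1.5, ("K", "E"): -0.5, ("L", "E"): 0.25}

energy = interaction_energy("RAK", "EY", contacts, potential)
print(energy)  # R-Y contributes -1.5 and K-E contributes -0.5
```

In a sketch like this, a lower total energy would be read as a more favorable predicted interaction, and the per-contact terms indicate which position pairs drive the prediction.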

© Copyright Policy - creative-commons

Figure 3: Performance of the basic model on unseen domains and peptides. To assess the expected performance of the energy model on peptides or domains not used in the construction process, we perform LOGO cross-validation. To do this, we exclude all the pairs including a particular domain or peptide from the training set used to create the potential and then use the derived potential to predict the interaction energies of the excluded pairs. In (A), we plot the ROC AUC scores for excluded peptides and in (B) we plot the scores for excluded domains. In both cases, the average ROC AUC is very high, 0.84, indicating that the energy function transfers well to new domains and peptides.

Mentions: Using the optimal interaction map, we carry out LOGO cross-validation to see how the method will fare on unseen domains and peptides. The results are shown in Figure 3, and are quite impressive, with mean ROC AUC values of 0.84 for both unseen domains and peptides. For a number of domains and peptides, the ROC AUC is 1, indicating that the method perfectly separates bound and unbound domain–peptide pairs. (See Supplementary Table S1 for a list of domains and peptides with their ROC AUCs.) Interestingly, the performance of the model on an unseen domain seems to be uncorrelated with the sequence identity between the test domain and its nearest sequence neighbor in the training set, whether we use the whole domain sequence (r = 0.19) or just the sequences at the contact positions (r = 0.30) (Supplementary Figure S2).
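
The cross-validation procedure described above can be sketched as follows. LOGO is read here as leave-one-group-out, which is an assumption, since this page does not expand the acronym. For each domain (or peptide), every pair containing it is held out, a model is fit on the remaining pairs, and the ROC AUC is computed on the held-out pairs. The model here is a deliberately trivial stand-in (the per-peptide binding rate in the training set), not the paper's energy potential, and all names and data are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str, bool]  # (domain id, peptide id, binds?)


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Chance that a random bound pair outscores a random unbound pair (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one bound and one unbound pair")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def logo_auc(pairs: List[Pair], key: Callable[[Pair], str],
             fit: Callable, score: Callable) -> Dict[str, float]:
    """Hold out each group in turn, fit on the rest, score the held-out pairs."""
    result = {}
    for group in sorted({key(p) for p in pairs}):
        train = [p for p in pairs if key(p) != group]
        test = [p for p in pairs if key(p) == group]
        labels = [p[2] for p in test]
        if all(labels) or not any(labels):
            continue  # AUC is undefined when only one class is present
        model = fit(train)
        result[group] = roc_auc([score(model, p) for p in test], labels)
    return result


def fit_peptide_rate(train: List[Pair]) -> Dict[str, float]:
    """Stand-in model (assumption): fraction of training pairs in which each peptide binds."""
    counts: Dict[str, Tuple[int, int]] = {}
    for _, pep, binds in train:
        hit, n = counts.get(pep, (0, 0))
        counts[pep] = (hit + int(binds), n + 1)
    return {pep: hit / n for pep, (hit, n) in counts.items()}


def score_pair(model: Dict[str, float], pair: Pair) -> float:
    """Higher score means more likely bound; with energies one would use -E instead."""
    return model.get(pair[1], 0.0)


# Toy data (invented for illustration).
pairs = [
    ("D1", "pA", True), ("D1", "pB", False), ("D1", "pC", True),
    ("D2", "pA", True), ("D2", "pB", False), ("D2", "pC", False),
    ("D3", "pA", True), ("D3", "pB", False), ("D3", "pC", True),
]

by_domain = logo_auc(pairs, key=lambda p: p[0], fit=fit_peptide_rate, score=score_pair)
mean_auc = sum(by_domain.values()) / len(by_domain)
print(by_domain, mean_auc)
```

Passing `key=lambda p: p[1]` instead would hold out peptides rather than domains, which corresponds to the other panel of the figure; the stand-in model above would then need to be replaced by one that can score an unseen peptide.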

