Accurate prediction of DnaK-peptide binding via homology modelling and experimental data.

Van Durme J, Maurer-Stroh S, Gallardo R, Wilkinson H, Rousseau F, Schymkowitz J - PLoS Comput. Biol. (2009)

Bottom Line: We show that this combination significantly outperforms either single approach. To test the robustness of the learning set, we have conducted a simulated cross-validation, where we omit sequences from the learning sets and calculate the rate of repredicting them. This resulted in a surprisingly good MCC of 0.703.


Affiliation: VIB Switch Laboratory, Vrije Universiteit Brussel, Brussels, Belgium.

ABSTRACT
Molecular chaperones are essential elements of the protein quality control machinery that governs translocation and folding of nascent polypeptides, refolding and degradation of misfolded proteins, and activation of a wide range of client proteins. The prokaryotic heat-shock protein DnaK is the E. coli representative of the ubiquitous Hsp70 family, which specializes in the binding of exposed hydrophobic regions in unfolded polypeptides. Accurate prediction of DnaK binding sites in E. coli proteins is an essential prerequisite to understand the precise function of this chaperone and the properties of its substrate proteins. In order to map DnaK binding sites in protein sequences, we have developed an algorithm that combines sequence information from peptide binding experiments and structural parameters from homology modelling. We show that this combination significantly outperforms either single approach. The final predictor had a Matthews correlation coefficient (MCC) of 0.819 when assessed over the 144 tested peptide sequences to detect true positives and true negatives. To test the robustness of the learning set, we have conducted a simulated cross-validation, where we omit sequences from the learning sets and calculate the rate of repredicting them. This resulted in a surprisingly good MCC of 0.703. The algorithm was also able to perform equally well on a blind test set of binders and non-binders, of which there was no prior knowledge in the learning sets. The algorithm is freely available at http://limbo.vib.be.
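The abstract reports performance as a Matthews correlation coefficient (MCC). As a reminder of what that number measures, here is a minimal sketch that computes the MCC from binary labels (1 = binder, 0 = non-binder). The function name and the example labels are illustrative assumptions and are not taken from the paper or its data.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Return the MCC for two equal-length sequences of 0/1 labels."""
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up example: 3 binders and 3 non-binders, with one error in each class.
example_true = [1, 1, 1, 0, 0, 0]
example_pred = [1, 1, 0, 0, 0, 1]
example_mcc = matthews_corrcoef(example_true, example_pred)
```

An MCC of 1 means every peptide is classified correctly, 0 is no better than chance, and -1 means every prediction is inverted. Unlike plain accuracy, the coefficient stays informative when binders and non-binders are unevenly represented.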

Figure 3: ROC curves as calculated from the PSSMs of different DnaK-substrate structures. The curves represent structures 1DKX (closed circle), 1DKY A (open square), 1DKY B (closed triangle) and 3 NMR structures from the same ensemble (open triangle, open circle, closed square).

Mentions: Variation in the peptide backbone could influence the quality of the resulting position-specific scoring matrix (PSSM). We therefore assessed multiple backbone conformations of the entire structure by generating a ROC curve for the PSSM derived from each structure and calculating the MCC in the high-specificity region. Structures of DnaK in complex with a substrate peptide were taken from an NMR ensemble (PDB code 1Q5L) [15] and from two crystal structures (PDB codes 1DKX and 1DKY, the latter being a DnaK dimer with monomers A and B) [5]. Structure 1DKX gave a higher MCC than either 1DKY monomer (A or B) and than each of three randomly picked NMR structures from the 1Q5L ensemble (Table 1 and Figure 3). Moreover, the ROC curves show that the NMR structures performed much worse overall than any of the X-ray structures. We therefore continued with the PSSM of the crystal structure 1DKX (see Supplementary Table S2b for the PSSM).
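The comparison described above can be illustrated with a small sketch: score each peptide against a PSSM, then sweep a score threshold to obtain the points of a ROC curve. Everything here is an assumption made for illustration. The three-position toy PSSM, the peptides, their labels and the helper names are invented and are not the matrix from Supplementary Table S2b. The sketch also summarises each curve by its area, whereas the text above ranks structures by MCC in the high-specificity region.

```python
def score_peptide(pssm, peptide):
    """Sum the per-position scores; residues missing from a column score 0."""
    return sum(column.get(residue, 0.0) for column, residue in zip(pssm, peptide))


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a peptide is predicted a binder if score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and not l)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a ROC curve given as ordered (fpr, tpr) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical toy PSSM: one dict of residue scores per peptide position.
toy_pssm = [
    {"L": 1.0, "D": -1.0},
    {"L": 1.0, "K": 0.5, "D": -1.0},
    {"L": 1.0, "D": -1.0},
]

# Hypothetical peptides with made-up binder (1) / non-binder (0) labels.
peptides = ["LLL", "LKL", "LDL", "DKL", "DLD"]
labels = [1, 1, 0, 1, 0]

scores = [score_peptide(toy_pssm, p) for p in peptides]
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
```

Repeating this for the PSSM derived from each candidate structure and comparing the resulting curves is one way to reproduce the kind of comparison shown in Figure 3, assuming the same labelled peptide set is used for every structure.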

