Amino acid empirical contact energy definitions for fold recognition in the space of contact maps.

Berrera M, Molinari H, Fogolari F - BMC Bioinformatics (2003)

Bottom Line: In 30 out of 35 cases the native structure is correctly recognized and best predictions are usually found among the 10 lowest energy predictions. An important prerequisite for the applicability of the approach is that the protein structure under study should not exhibit anomalous solvent accessibility, compared to soluble proteins whose structure is deposited in the Protein Data Bank. The combined evaluation of a solvent accessibility parameter and contact energy allows for an effective gross screening of predictive models.


Affiliation: International School for Advanced Studies, Via Beirut 4, 34014 Trieste, Italy. berrera@sissa.it

ABSTRACT

Background: Contradicting evidence has been presented in the literature concerning the effectiveness of empirical contact energies for fold recognition. Empirical contact energies are calculated on the basis of information available from selected protein structures, with respect to a defined reference state, according to the quasi-chemical approximation. Protein-solvent interactions are estimated from residue solvent accessibility.

Results: In the approach presented here, contact energies are derived from the potential of mean force theory. Several definitions of contact are examined, and their performance in fold recognition is evaluated on sets of decoy structures. The best definition of contact is then tested in a more realistic scenario, on all predictions including side chains accepted in the CASP4 experiment. In 30 out of 35 cases the native structure is correctly recognized, and the best predictions are usually found among the 10 lowest energy predictions.
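As an illustration of how a statistical contact energy can be obtained from observed contacts, the following is a minimal sketch that assumes a simple random-mixing reference state. The function name `contact_energies`, the `kT` parameter and the choice of reference state are assumptions made for this example; the reference state actually used in the paper is not specified in this excerpt.

```python
import math
from collections import Counter


def contact_energies(contacts, kT=1.0):
    """Estimate pairwise contact energies from a list of observed contacts.

    contacts: iterable of (residue_type_a, residue_type_b) tuples.
    Returns a dict mapping a sorted residue-type pair to an energy in units of kT.
    Assumption: the reference state is random mixing of contact "ends".
    """
    # Count each unordered pair once, regardless of the order it was listed in.
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total = sum(pair_counts.values())

    # How often each residue type appears at either end of a contact.
    type_counts = Counter()
    for (a, b), n in pair_counts.items():
        type_counts[a] += n
        type_counts[b] += n
    ends = 2 * total

    energies = {}
    for (a, b), n_obs in pair_counts.items():
        f_a = type_counts[a] / ends
        f_b = type_counts[b] / ends
        # Expected count under random mixing; unlike pairs can form in two ways.
        n_exp = total * f_a * f_b * (1 if a == b else 2)
        # Potential-of-mean-force style energy: negative when a pair is over-represented.
        energies[(a, b)] = -kT * math.log(n_obs / n_exp)
    return energies
```

Pairs seen more often than the reference state predicts receive a negative (favourable) energy, and pairs seen less often receive a positive one. Pairs that are never observed are simply absent from the result in this sketch.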

Conclusion: The definition of contact based on van der Waals radii of alpha carbon and side chain heavy atoms is seen to perform better than other definitions involving only alpha carbons, only beta carbons, all heavy atoms or only backbone atoms. An important prerequisite for the applicability of the approach is that the protein structure under study should not exhibit anomalous solvent accessibility, compared to soluble proteins whose structure is deposited in the Protein Data Bank. The combined evaluation of a solvent accessibility parameter and contact energy allows for an effective gross screening of predictive models.
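A contact definition based on van der Waals radii can be sketched as follows: two residues are taken to be in contact when any pair of their selected atoms lies closer than the sum of the two radii plus a tolerance. The radii below are commonly quoted Bondi-style values, and the `tolerance` and `min_separation` parameters are hypothetical choices for illustration; the values used in the paper are not given in this excerpt. To follow the best-performing definition, each residue's atom list would hold its alpha carbon and side chain heavy atoms.

```python
import math

# Illustrative van der Waals radii in angstroms (assumed values, not from the paper).
VDW_RADIUS = {"C": 1.7, "N": 1.55, "O": 1.52, "S": 1.8}


def in_contact(atoms_a, atoms_b, tolerance=1.0):
    """Return True if any atom pair is within the sum of radii plus tolerance.

    Each atom is a tuple (element, (x, y, z)) with coordinates in angstroms.
    """
    for elem_a, xyz_a in atoms_a:
        for elem_b, xyz_b in atoms_b:
            cutoff = VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b] + tolerance
            if math.dist(xyz_a, xyz_b) <= cutoff:
                return True
    return False


def contact_map(residues, min_separation=3, tolerance=1.0):
    """Build a contact map as a set of (i, j) residue index pairs with i < j.

    residues: list of atom lists in sequence order.
    min_separation: smallest sequence separation considered (hypothetical default),
    used to skip trivially close neighbours along the chain.
    """
    contacts = set()
    for i in range(len(residues)):
        for j in range(i + min_separation, len(residues)):
            if in_contact(residues[i], residues[j], tolerance):
                contacts.add((i, j))
    return contacts
```

Switching to another definition (only alpha carbons, only beta carbons, all heavy atoms, or only backbone atoms) amounts to changing which atoms are placed in each residue's list.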


Figure 12: Structure of target 124 in the CASP4 experiment. The structure of the C-terminal domain of turkey phospholipase C beta is shown (PDB ID 1jad, chain A). The long helix forms a coiled coil with the other chain in the PDB file.

Mentions: Problems with NMR structures relative to X-ray structures have been repeatedly pointed out (see e.g. ref. [31]), so failure in native structure recognition could also be due to artifacts in structure generation from NMR restraints. These few examples highlight the complexity of predicting real proteins, where biological insight and additional information about function, environment, ligands and multimeric state are of utmost importance. Note that only 15 targets had predictions (in our selection) with RMSD lower than 5 Å. For 12 of these 15 targets, a low RMSD prediction (less than 5 Å) was found among the ten lowest energy predictive models. For one of the remaining three targets (target 90), the chosen prediction still has a rather low RMSD (6.125 Å). The other two targets (targets 120 and 124), where the method fails to recognize low RMSD predictions, are dimers for which only a monomer or part of it is modelled. For the sake of clarity, the structures of the chains to be predicted (PDB IDs 1fu1 and 1jad, respectively) are shown in Figures 11 and 12. Overall, these results demonstrate the capability of contact energy (with the optimal contact definition) to recognize low RMSD predictions among the lowest energy models, provided that the structure to be modelled has the typical features of soluble globular proteins. The same results, however, show that it is difficult to assess the reliability of the predictive models, because there is almost no correlation between the energy per residue and the proximity of the models to the native structure (when all RMSD-energy pairs are pooled together), at least for the models we selected from the CASP4 experiment.
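The two evaluations described above can be sketched in a few lines: checking whether a low RMSD prediction appears among the k lowest energy models of a target, and measuring the correlation between energy and RMSD over pooled models. The function names and the `(energy, rmsd)` tuple layout are assumptions for this example, and Pearson correlation is used here as one plausible measure; the excerpt does not say which statistic the authors computed.

```python
import math


def low_rmsd_in_top_k(models, k=10, rmsd_cutoff=5.0):
    """Return True if any of the k lowest energy models has RMSD below the cutoff.

    models: list of (energy, rmsd) tuples for one target, RMSD in angstroms.
    """
    ranked = sorted(models, key=lambda m: m[0])  # lowest energy first
    return any(rmsd < rmsd_cutoff for _, rmsd in ranked[:k])


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A target counts as a success when `low_rmsd_in_top_k` returns True for its set of models. A value of `pearson` close to zero over the pooled energy per residue and RMSD values would correspond to the weak correlation reported in the text.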

