A resource for benchmarking the usefulness of protein structure models.

Carbajo D, Tramontano A - BMC Bioinformatics (2012)

Bottom Line: The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which specific threshold of accuracy is required to perform the task effectively.


Affiliation: Department of Physics, Sapienza University of Rome, P.le A. Moro, 5, 00185 Rome, Italy.

ABSTRACT

Background: Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of purposes. The most effective strategies rely on the knowledge of the three-dimensional structure of the protein of interest. However, an experimental structure is often not available, and models of varying quality are used instead. The relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications.

Results: This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. The comparison of the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess which specific threshold of accuracy is required to perform the task effectively.

Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. IMPLEMENTATION, AVAILABILITY AND REQUIREMENTS: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); MySQL, Perl DBI and DBD modules (database); PHP, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.


Figure 6: Mean (left) and maximum (right) average Euclidean distance differences for residues of catalytic sites annotated in CSA [21,22] for models with different values of GDT-TS.

Mentions: We measured the Euclidean distance differences between every permutation of catalytic residue Cαs constituting an active site (with two or more residues) in the native structure and in its models of varying quality. Averaging all the differences over each active site showed that the mean Euclidean distance difference increases as model quality, measured by GDT-TS, decreases (Figure 6). However, the maximum mean value of the difference per site (when model quality is the lowest) is never much higher than 0.5 Å, implying that the relative positions of the catalytic residues can be estimated effectively even in models of relatively low quality (Figure 6B).
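
The per-site measurement described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary input format, and the residue labels are hypothetical, and the sketch assumes that the Cα coordinates of the catalytic residues have already been extracted from the native structure and from a model. It also assumes unordered residue pairs; since a distance is symmetric, averaging over ordered pairs would give the same mean.

```python
from itertools import combinations
from math import dist  # Python 3.8+
from statistics import mean


def site_distance_differences(native_ca, model_ca):
    """Return |d_native - d_model| for every pair of catalytic residues.

    native_ca and model_ca map a residue label (hypothetical format) to
    its C-alpha (x, y, z) coordinates in angstroms. Only residues present
    in both structures are compared, in sorted label order.
    """
    shared = sorted(set(native_ca) & set(model_ca))
    if len(shared) < 2:
        raise ValueError("an active site needs at least two shared residues")
    return [
        abs(dist(native_ca[a], native_ca[b]) - dist(model_ca[a], model_ca[b]))
        for a, b in combinations(shared, 2)
    ]


def mean_site_difference(native_ca, model_ca):
    """Average the pairwise distance differences over one active site."""
    return mean(site_distance_differences(native_ca, model_ca))


# Made-up coordinates for a three-residue site (illustration only).
native = {"H57": (0.0, 0.0, 0.0), "D102": (3.0, 4.0, 0.0), "S195": (0.0, 0.0, 5.0)}
model = {"H57": (0.0, 0.0, 0.0), "D102": (3.0, 4.0, 0.0), "S195": (0.0, 0.0, 6.0)}

diffs = site_distance_differences(native, model)
site_mean = mean_site_difference(native, model)
print(f"{len(diffs)} pairs, mean difference {site_mean:.3f} angstroms")
```

Because only intra-structure distances are compared, the measure needs no superposition of the model onto the native structure, which is one plausible reason it stays small even when the global GDT-TS is low.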

