A resource for benchmarking the usefulness of protein structure models.

Carbajo D, Tramontano A - BMC Bioinformatics (2012)

Bottom Line: The most effective strategies rely on knowledge of the three-dimensional structure of the protein of interest. Comparing the results of a computational experiment on the experimental structure with those on a set of its decoy models allows developers and users to assess the specific threshold of accuracy required to perform the task effectively.


Affiliation: Department of Physics, Sapienza University of Rome, P.le A. Moro, 5, 00185 Rome, Italy.

ABSTRACT

Background: Increasingly, biologists and biochemists use computational tools to design experiments that probe the function of proteins and/or to engineer proteins for a variety of purposes. The most effective strategies rely on knowledge of the three-dimensional structure of the protein of interest. However, an experimental structure is often not available, and models of varying quality are used instead. The relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications.

Results: This paper describes a database and related software tools that allow a given structure-based method to be tested on models of a protein representing different levels of accuracy. Comparing the results of a computational experiment on the experimental structure with those on a set of its decoy models allows developers and users to assess the specific threshold of accuracy required to perform the task effectively.

Conclusions: The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database.

Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): Platform independent. Programming language: Perl-BioPerl (program); MySQL, Perl DBI and DBD modules (database); PHP, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI Blast package, DSSP, Speedfill (Surfnet) and PSAIA. License: Free. Any restrictions to use by non-academics: No.


Figure 4: Percentage of protein models with different GDT-TS values in which at least 75% of the exposed (left) and buried (right) residues can be correctly identified.

Mentions: For example, models are often used to identify suitable locations for modification or functionalization of the protein. One can therefore ask to what extent exposed and buried residues can still be distinguished using a model, and what minimum level of model quality is required to obtain meaningful results. We defined exposed residues as those with a solvent accessibility above 70% of the maximum value for that residue type, as defined by Miller et al. [28], and buried residues as those with a value below 30%. As shown in Figure 4, at least 75% of the exposed residues can be correctly identified in more than 40% of the models with a GDT-TS above 90, and in almost 30% of those with a GDT-TS above 80. Below the latter threshold, the percentage of models in which at least 75% of the exposed residues are correctly detected reaches 10%. This should be kept in mind when using models as frameworks for experiments.
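
The analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`classify`, `recovered_fraction`, `percent_successful`) are hypothetical, the relative accessibilities are assumed to have been computed beforehand (for instance, by dividing a per-residue accessible surface area by the maximum value for that residue type), and the numbers in the usage section are made up for demonstration. Only the 70% and 30% cutoffs and the 75% criterion come from the text.

```python
from typing import List, Sequence, Tuple

EXPOSED_CUTOFF = 0.70  # relative accessibility above this: exposed (from the text)
BURIED_CUTOFF = 0.30   # relative accessibility below this: buried (from the text)


def classify(rel_acc: float) -> str:
    """Label one residue from its relative solvent accessibility (0..1)."""
    if rel_acc > EXPOSED_CUTOFF:
        return "exposed"
    if rel_acc < BURIED_CUTOFF:
        return "buried"
    return "intermediate"


def recovered_fraction(native: Sequence[float],
                       model: Sequence[float],
                       label: str) -> float:
    """Fraction of residues carrying `label` in the experimental structure
    that carry the same label in the model (residues aligned by position)."""
    if len(native) != len(model):
        raise ValueError("native and model must have the same length")
    total = 0
    hits = 0
    for n, m in zip(native, model):
        if classify(n) == label:
            total += 1
            if classify(m) == label:
                hits += 1
    return hits / total if total else float("nan")


def percent_successful(models: List[Tuple[float, float]],
                       gdt_above: float,
                       min_fraction: float = 0.75) -> float:
    """Percentage of models with GDT-TS above `gdt_above` in which at least
    `min_fraction` of the residues were recovered.
    Each entry of `models` is (GDT-TS, recovered fraction)."""
    selected = [frac for gdt, frac in models if gdt > gdt_above]
    if not selected:
        return float("nan")
    return 100.0 * sum(f >= min_fraction for f in selected) / len(selected)


# Usage with made-up values (hypothetical data, for illustration only).
native_acc = [0.85, 0.10, 0.75, 0.50, 0.05]
model_acc = [0.80, 0.20, 0.60, 0.45, 0.35]
print(recovered_fraction(native_acc, model_acc, "exposed"))  # 0.5
print(recovered_fraction(native_acc, model_acc, "buried"))   # 0.5

decoys = [(95.0, 0.80), (92.0, 0.50), (85.0, 0.90), (70.0, 0.20)]
print(percent_successful(decoys, gdt_above=90.0))            # 50.0
```

Keeping the per-residue classification separate from the per-model summary makes it straightforward to repeat the same summary for other GDT-TS thresholds or for the buried class.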

