SELECTpro: effective protein model selection using a structure-based energy function resistant to BLUNDERs.

Randall A, Baldi P - BMC Struct. Biol. (2008)

Bottom Line: The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models. Novel and unique energy terms include predicted secondary structure, predicted solvent accessibility, predicted contact map, beta-strand pairing, and side-chain hydrogen bonding. SELECTpro participated in the new model quality assessment (QA) category in CASP7, submitting predictions for all 95 targets and achieved top results. SELECTpro is an effective model selection method that scores models independently and is appropriate for use on any model set.


Affiliation: School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA. arandall@ics.uci.edu

ABSTRACT

Background: Protein tertiary structure prediction is a fundamental problem in computational biology and identifying the most native-like model from a set of predicted models is a key sub-problem. Consensus methods work well when the redundant models in the set are the most native-like, but fail when the most native-like model is unique. In contrast, structure-based methods score models independently and can be applied to model sets of any size and redundancy level. Additionally, structure-based methods have a variety of important applications including analogous fold recognition, refinement of sequence-structure alignments, and de novo prediction. The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models.

Results: Here we introduce SELECTpro, a novel structure-based model selection method derived from an energy function comprising physical, statistical, and predicted structural terms. Novel and unique energy terms include predicted secondary structure, predicted solvent accessibility, predicted contact map, beta-strand pairing, and side-chain hydrogen bonding. SELECTpro participated in the new model quality assessment (QA) category in CASP7, submitting predictions for all 95 targets and achieving top results. The average difference in GDT-TS between the model ranked first by SELECTpro and the most native-like model was 5.07. This GDT-TS difference was less than 1% of the GDT-TS of the most native-like model for 18 targets, and less than 10% for 66 targets. SELECTpro also ranked the single most native-like model first for 15 targets, in the top five for 39 targets, and in the top ten for 53 targets, more often than any other method. Because the ranking metric is skewed by model redundancy and ignores poor models that are ranked above the most native-like model, the BLUNDER metric is introduced to overcome these limitations. SELECTpro is also evaluated on a recent benchmark set of 16 small proteins with large decoy sets of 12,500 to 20,000 models for each protein, where it outperforms the benchmarked method (I-TASSER).
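The per-target quantities reported above (the GDT-TS difference between the first-ranked model and the most native-like model, that difference as a percentage of the best GDT-TS, and the rank of the most native-like model) can be illustrated with a minimal Python sketch. The function name, the dictionary keys, the example numbers, and the convention that a lower energy means a better model are all assumptions made for illustration; this is not SELECTpro's code, and it does not implement the BLUNDER metric, which the abstract does not define.

```python
from typing import Dict, Sequence


def selection_summary(energies: Sequence[float], gdt_ts: Sequence[float]) -> Dict[str, float]:
    """Summarize how well an energy-based ranking selects the best model.

    Assumption: lower energy means a better (more native-like) prediction.
    """
    if len(energies) != len(gdt_ts) or len(energies) == 0:
        raise ValueError("energies and gdt_ts must be non-empty and equally long")

    # Model indices sorted from best (lowest) to worst (highest) energy.
    order = sorted(range(len(energies)), key=lambda i: energies[i])
    selected = order[0]

    # The most native-like model is the one with the highest GDT-TS.
    best = max(range(len(gdt_ts)), key=lambda i: gdt_ts[i])

    loss = gdt_ts[best] - gdt_ts[selected]
    relative = 100.0 * loss / gdt_ts[best] if gdt_ts[best] > 0 else 0.0

    return {
        "selected_index": selected,
        "best_index": best,
        "gdt_ts_loss": loss,             # absolute GDT-TS difference
        "relative_loss_pct": relative,   # difference as % of the best GDT-TS
        "rank_of_best": order.index(best) + 1,  # 1-based rank of the best model
    }


# Hypothetical scores for four models of a single target.
example = selection_summary(
    energies=[-120.5, -98.0, -131.2, -110.0],
    gdt_ts=[61.0, 48.5, 58.0, 40.0],
)
print(example)
```

In this made-up example the lowest-energy model is not the most native-like one, so the sketch reports a nonzero GDT-TS loss and a rank greater than one for the best model. Averaging `gdt_ts_loss` over targets would give a figure of the same kind as the 5.07 average quoted above.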

Conclusion: SELECTpro is an effective model selection method that scores models independently and is appropriate for use on any model set. SELECTpro is available for download as a standalone application at: http://www.igb.uci.edu/~baldig/selectpro.html. SELECTpro is also available as a public server at the same site.


Figure 2: Reranking models from top servers. Each server predictor submitted five models per target, with the highest-confidence model ranked first. (A) The number of targets for which each server's highest GDT-TS model is ranked first is shown with gray bars for the original ranking and with black bars when the models are reranked with SELECTpro. (B) The change in average GDT-TS for each group when SELECTpro is used to select model 1. P-values of paired t-tests are shown above the horizontal axis when SELECTpro demonstrates improved model selection; statistically significant improvements (p < 0.05) are in bold.

Mentions: The utility of SELECTpro for selecting the best model from a small set is demonstrated by selecting from the five models submitted for each target by the top automated predictors. These small-set selection results are calculated using SetAll (Figure 2). SELECTpro is also evaluated on a recent benchmark set of 16 small proteins with large decoy sets of 12,500 to 20,000 models for each protein and compared to I-TASSER (Figure 3).
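The paired comparison behind Figure 2 (GDT-TS of a server's original model 1 versus the model chosen by reranking, target by target) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the four example values are hypothetical, and the sketch computes only the paired t statistic, not the p-values shown in the figure. Obtaining a p-value would additionally need a Student t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t_statistic(before: Sequence[float], after: Sequence[float]) -> Tuple[float, float]:
    """Return (mean difference, paired t statistic) for matched samples.

    `before` and `after` hold one GDT-TS value per target, in the same order.
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long samples with at least two targets")

    # Per-target change in GDT-TS (after reranking minus original model 1).
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t statistic is undefined")

    # Standard error of the mean difference, then the t statistic.
    t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Hypothetical GDT-TS values for four targets from one server.
original_model1 = [50.0, 60.0, 55.0, 70.0]
reranked_model1 = [52.0, 63.0, 55.0, 74.0]
mean_change, t_value = paired_t_statistic(original_model1, reranked_model1)
print(f"mean change = {mean_change:.2f}, t = {t_value:.3f}")
```

The statistic has `n - 1` degrees of freedom, where `n` is the number of targets; a paired test is the natural choice here because both selections are evaluated on the same targets.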

