Large-scale model quality assessment for improving protein tertiary structure prediction.

Cao R, Bhattacharya D, Adhikari B, Li J, Cheng J - Bioinformatics (2015)

Bottom Line: Our experiment demonstrates that the large-scale model QA approach is more consistent and robust in selecting models of better quality than any individual QA method. MULTICOM was officially ranked third out of all 143 human and server predictors according to the total scores of the first models predicted for 78 CASP11 protein domains, and second according to the total scores of the best of the five models predicted for these domains. MULTICOM's outstanding performance in the extremely competitive 2014 CASP11 experiment proves that our large-scale QA approach together with model clustering is a promising solution to one of the two major problems in protein structure modeling.


Affiliation: Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA, Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA and C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA.

Fig. 2. Tertiary structure prediction of domain 2 of T0783 (T0783-D2). (A) The superposition of the MULTICOM human TS1 model on domain 2 with the native structure. (B) The distribution of 191 models in the model pool. (C) The plot of the true GDT-TS scores of models against their predicted ranking.

Mentions: In addition to assessing the overall performance, we examined two targets in detail to illustrate how MULTICOM assessed the quality of their models. The first case is T0783-D2 (domain 2 of target T0783). Figure 2A illustrates the distribution of the GDT-TS scores of the models of this domain: most of the models have a true GDT-TS score below 0.2 (i.e. very low quality), some have a GDT-TS score around 0.4 (medium quality), and a few have a GDT-TS score of 0.6 (relatively good quality). Figure 2B plots the true GDT-TS scores of these models against their ranking predicted by MULTICOM. It shows that MULTICOM ranked the model with the highest GDT-TS score (nns_TS1) as no. 1. In this case, all the individual single QA methods ranked this model within the top five, but a pairwise method ranked it at no. 19. By combining these individual rankings, the consensus ranking predicted by MULTICOM selected this model, which was then combined with three other similar models (nns_TS3, nns_TS2 and FFAS-3D_TS1) to generate a refined model as the final prediction. Figure 2C shows the superposition of this model with the native structure, which is an alpha-beta-alpha protein. Our final model has a well-predicted four-strand beta-sheet in the middle and two well-positioned alpha helices on the periphery. The final GDT-TS score of this model is 0.625.
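The idea of combining individual rankings into a consensus ranking can be illustrated with a small sketch. This excerpt does not state how MULTICOM combines the rankings, so the average-rank rule below is an assumption chosen for simplicity, not the authors' method. The function name `consensus_ranking`, the method names, and the model names are all hypothetical, and the toy rankings are invented for illustration rather than taken from CASP11 data.

```python
from statistics import mean


def consensus_ranking(rankings):
    """Order models by their average rank across several QA methods.

    rankings: dict mapping a method name to a list of model names,
    ordered from best to worst according to that method.
    Average rank is an assumed combination rule, used here only as a sketch.
    """
    models = set()
    for order in rankings.values():
        models.update(order)

    avg_rank = {}
    for model in models:
        ranks = []
        for order in rankings.values():
            if model in order:
                ranks.append(order.index(model) + 1)
            else:
                # A model a method did not rank gets that method's worst rank + 1.
                ranks.append(len(order) + 1)
        avg_rank[model] = mean(ranks)

    # Lower average rank is better; ties are broken alphabetically.
    return sorted(models, key=lambda m: (avg_rank[m], m))


# Hypothetical toy input: three single-model methods agree that model_a
# is near the top, while one pairwise method ranks it last.
toy_rankings = {
    "single_qa_1": ["model_a", "model_b", "model_c", "model_d"],
    "single_qa_2": ["model_b", "model_a", "model_c", "model_d"],
    "single_qa_3": ["model_a", "model_c", "model_b", "model_d"],
    "pairwise_qa": ["model_c", "model_d", "model_b", "model_a"],
}

consensus = consensus_ranking(toy_rankings)
print(consensus)
```

In this toy case one dissenting method does not displace a model that the other methods rank highly, which mirrors the situation described for nns_TS1, where a single pairwise method ranked the best model at no. 19.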

