Quality assessment of protein model-structures based on structural and functional similarities.

Konopka BM, Nebel JC, Kotulska M - BMC Bioinformatics (2012)

Bottom Line: The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.

Affiliation: Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland.

ABSTRACT

Background: Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is considered one of the ways to deal with this problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists are able to automatically generate a putative 3D structure model of any protein. The main issue then becomes the evaluation of model quality, which is one of the most important challenges of structural biology.

Results: GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high-quality model is expected to be structurally similar to proteins that are functionally similar to the prediction target. Structural similarity is measured with DALI, whereas functional similarity is quantified using the standardized, hierarchical description of proteins provided by Gene Ontology, combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures: one is a single-model quality assessment method; the other is a modification of it that provides a relative measure of model quality. An exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests.
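
To make the functional-similarity step more concrete, the following is a minimal sketch of Wang's semantic similarity between two Gene Ontology terms. It is an illustration under stated assumptions, not GOBA's implementation: the toy ontology `toy_parents` and its term names are hypothetical, a single edge weight of 0.8 is assumed for every relation (the original algorithm distinguishes relation types), and the way GOBA aggregates term-level similarities into a protein-level score is not covered here.

```python
def s_values(term, parents, weight=0.8):
    """S-values of `term` and all of its ancestors.

    parents: {term: [parent terms]} describing a toy GO-like DAG.
    The term itself contributes 1.0; each ancestor receives the best
    (maximum) product of edge weights over all paths leading to it.
    A single edge weight is an assumption of this sketch.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent in parents.get(child, []):
            candidate = weight * s[child]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)  # re-propagate the improved value
    return s


def wang_similarity(a, b, parents, weight=0.8):
    """Shared S-values of the common ancestors over the total S-values."""
    sa = s_values(a, parents, weight)
    sb = s_values(b, parents, weight)
    common = set(sa) & set(sb)
    shared = sum(sa[t] + sb[t] for t in common)
    return shared / (sum(sa.values()) + sum(sb.values()))


# Hypothetical toy ontology: A and B are children of root, C is a child of A.
toy_parents = {"A": ["root"], "B": ["root"], "C": ["A"]}

if __name__ == "__main__":
    print(wang_similarity("A", "B", toy_parents))
    print(wang_similarity("C", "A", toy_parents))
```

In this sketch a term compared with itself scores 1, and two terms that share only distant ancestors score close to 0, which matches the intuition that functionally similar proteins carry closely related annotations.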

Conclusions: The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single-model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
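
The reported figures are mean "per target" Pearson correlations. As a minimal sketch of how such a number can be computed, the code below correlates predicted scores with observed quality separately for each target and then averages the coefficients. The function names and the input layout (a dictionary keyed by target identifier) are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mean_per_target_correlation(targets):
    """targets: {target_id: (predicted_scores, observed_quality)}.

    Returns the mean of the per-target coefficients together with
    the individual coefficients.
    """
    per_target = {
        target_id: pearson(predicted, observed)
        for target_id, (predicted, observed) in targets.items()
    }
    return sum(per_target.values()) / len(per_target), per_target
```

Averaging per target, rather than pooling all models, keeps targets with many submitted models from dominating the summary.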

Figure 13: Gain in “per target” correlations after introducing the outlier exclusion procedure. The procedure improved the performance of GOBA methods in all sets: CASP8 (gray), CASP9-in-contest (red) and CASP9 (green). The application of the procedure was most beneficial in the CASP9 set.

Mentions: After the CASP9-in-contest test, an exclusion procedure was added as a post-processing step of GOBA (for details see Methods). The extended GOBA scores, i.e. the yGA-family scores, are based on explicit Z-score values, which are not absolute measures of structural similarity. Therefore yGA scores should only be used to assess models that share a common structural neighborhood (SN). For some CASP targets this condition was not fulfilled: some models differed so much that their associated SNs generated by DALI were unrelated. This resulted in exceptionally large score differences between models with different structural neighborhoods (e.g. the previously mentioned models of T0628 from CASP9). The exclusion procedure uses a “majority voting” approach: it excludes suspicious and outlier models from the evaluation (for example, the 10 models mentioned in the T0628 case) when their yGA-family scores are too high. The procedure improved the performance of GOBA considerably (Figure 13, Table 5). In some cases, however, it may exclude high-quality (HQ) model structures set against a relatively poor background, i.e. when the majority of the ensemble consists of low-quality models. Therefore each excluded model should be examined manually. Since this procedure was not tested during the CASP9 contest, it is treated here as an addition to the basic methods.
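
The exact exclusion rule is given in the paper's Methods and is not reproduced here. Purely as an illustration of the idea of letting the majority of an ensemble define what counts as "too high", the sketch below flags models whose score exceeds the ensemble median by more than `k` median absolute deviations. The rule, the parameter `k` and the function name `split_outliers` are assumptions of this sketch, not the authors' procedure.

```python
from statistics import median


def split_outliers(scores, k=3.0):
    """Split models into (kept, excluded) by a median/MAD rule.

    scores: {model_id: score}. A model is excluded when its score is
    higher than median + k * MAD, where MAD is the median absolute
    deviation. This rule is a hypothetical stand-in for the paper's
    exclusion procedure.
    """
    values = list(scores.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread among the majority: exclude nothing, to stay conservative.
        return dict(scores), {}
    threshold = med + k * mad
    kept = {m: s for m, s in scores.items() if s <= threshold}
    excluded = {m: s for m, s in scores.items() if s > threshold}
    return kept, excluded
```

Returning the excluded models instead of discarding them silently fits the recommendation above that each excluded model be examined manually, since a high-quality model in a poor ensemble can be flagged by any majority-based rule.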

