Quality assessment of protein model-structures based on structural and functional similarities.

Konopka BM, Nebel JC, Kotulska M - BMC Bioinformatics (2012)

Bottom Line: The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8- and CASP9-based validation sets, respectively. GOBA also obtained the best result for two targets of CASP8 and one of CASP9, compared to the contest participants. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.


Affiliation: Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland.

ABSTRACT

Background: Experimental determination of protein 3D structures is expensive, time-consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is regarded as one way to address this problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists can automatically generate a putative 3D structure model of any protein. The main issue then becomes evaluating the quality of the model, which is one of the most important challenges of structural biology.

Results: GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high-quality model should be structurally similar to proteins that are functionally similar to the prediction target. Structural similarity is measured with DALI, whereas functional similarity is quantified using the standardized, hierarchical description of proteins provided by Gene Ontology, combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single-model quality assessment method; the other is a modification of it that provides a relative measure of model quality. An exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests.
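To make the semantic-similarity step more concrete, the following is a minimal sketch of Wang's measure for a pair of Gene Ontology terms. It is not GOBA's code: the toy ontology, the term names and the function names are hypothetical, and the edge weights (0.8 for is_a, 0.6 for part_of) are the values commonly quoted for Wang's method and are treated here as an assumption. How GOBA aggregates term-level similarities into a protein-level functional similarity is not described in this abstract and is not shown.

```python
# Sketch of Wang's semantic similarity between two GO terms.
# TOY_DAG is a made-up ontology used only for illustration.
# Each term maps to a list of (parent, relation) edges.
TOY_DAG = {
    "root": [],
    "binding": [("root", "is_a")],
    "catalysis": [("root", "is_a")],
    "ion_binding": [("binding", "is_a")],
    "metal_binding": [("ion_binding", "is_a")],
    "kinase": [("catalysis", "is_a"), ("binding", "part_of")],
}

# Assumed semantic contribution factors per relation type.
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term, dag, weights):
    """Return the S-value of `term` and of every ancestor of `term`.

    The term itself contributes 1.0; each ancestor receives the largest
    product of edge weights over all paths leading up from `term`.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in dag[child]:
            candidate = weights[relation] * s[child]
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)  # re-propagate the improved value
    return s


def wang_similarity(term_a, term_b, dag=TOY_DAG, weights=WEIGHTS):
    """Similarity in (0, 1] based on the ancestors shared by both terms."""
    s_a = s_values(term_a, dag, weights)
    s_b = s_values(term_b, dag, weights)
    shared = s_a.keys() & s_b.keys()
    numerator = sum(s_a[t] + s_b[t] for t in shared)
    denominator = sum(s_a.values()) + sum(s_b.values())
    return numerator / denominator


print(wang_similarity("metal_binding", "ion_binding"))
print(wang_similarity("metal_binding", "kinase"))
```

In this toy ontology, a term and its direct parent score higher than two terms that share only distant ancestors, which is the behavior expected of a semantic similarity measure.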

Conclusions: The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8- and CASP9-based validation sets, respectively. GOBA also obtained the best result for two targets of CASP8 and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single-model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
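The mean Pearson correlation quoted above is an average of per-target correlations between predicted scores and observed model quality. A minimal sketch of that calculation follows; the target names, the numbers and the function names are hypothetical and are not taken from the paper's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mean_per_target_pearson(scores_by_target):
    """Average the per-target correlations.

    `scores_by_target` maps a target name to a pair
    (predicted quality scores, observed quality such as GDT_TS),
    with one entry per model of that target.
    """
    correlations = [
        pearson(predicted, observed)
        for predicted, observed in scores_by_target.values()
    ]
    return sum(correlations) / len(correlations)


# Made-up example values for two hypothetical targets.
example = {
    "target_A": ([0.2, 0.5, 0.7, 0.9], [20.0, 48.0, 65.0, 88.0]),
    "target_B": ([0.3, 0.4, 0.8, 0.6], [35.0, 30.0, 70.0, 61.0]),
}
print(mean_per_target_pearson(example))
```

Averaging per target, rather than pooling all models, keeps targets of different difficulty from dominating the correlation.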


Figure 10: The example of T0429 presents a case where GOBA is superior to all other methods in CASP8. Assessment of models submitted by ModFOLDclust. The models formed two structural clusters, which can be observed as two distinct groups (encircled) on the ModFOLDclust vs GDT_TS graph. All points are colored using a blue-red color scale based on GOBA yGA579 scores. Blue and red are assigned to the worst and best models, respectively, in terms of yGA579. In this case the consensus method erroneously treated the two clusters as equivalent, while GOBA correctly assigned the lowest scores to models from the first cluster and higher scores to models from the second one.

Mentions: In terms of the average “per target” correlation, yGA579 was the best-performing GOBA metric. For the CASP8 targets T0429-D1 and T0504-D3 and the CASP9 target T0575, GOBA was the best among all compared methods. In addition, yGA579 was one of the top-performing methods for many other targets, e.g. T0504-D1, T0501-D1 (CASP8) or T0563 (CASP9) (see Additional file 4 and Additional file 5). The example of the T0504 domains shows that our method can perform as well as the best state-of-the-art approaches, whilst the case of T0429-D1 reveals that GOBA can have an advantage over these methods. This is illustrated by the two large clusters formed by the models of T0429 submitted to the CASP8 contest (Figure 10). While clustering methods estimated the quality of models from the two clusters as equal, GOBA could distinguish between them and provide better quality estimates.

