Quality assessment of protein model-structures using evolutionary conservation.

Kalman M, Ben-Tal N - Bioinformatics (2010)

Bottom Line: We present ConQuass, a novel quality assessment program based on the consistency between the model structure and the protein's conservation pattern. We show that it can identify problematic structural models, and that the scores it assigns to the server models in CASP8 correlate with the similarity of the models to the native structure. We also show that when the conservation information is reliable, the method's performance is comparable and complementary to that of the other single-structure quality assessment methods that participated in CASP8 and that do not use additional structural information from homologs.

Affiliation: Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel.

ABSTRACT

Motivation: Programs that evaluate the quality of a protein structural model are important both for validating the structure determination procedure and for guiding the model-building process. Such programs are based on properties of native structures that are generally not expected for faulty models. One such property, which is rarely used for automatic structure quality assessment, is the tendency for conserved residues to be located at the structural core and for variable residues to be located at the surface.

Results: We present ConQuass, a novel quality assessment program based on the consistency between the model structure and the protein's conservation pattern. We show that it can identify problematic structural models, and that the scores it assigns to the server models in CASP8 correlate with the similarity of the models to the native structure. We also show that when the conservation information is reliable, the method's performance is comparable and complementary to that of the other single-structure quality assessment methods that participated in CASP8 and that do not use additional structural information from homologs.

Availability: A Perl implementation of the method, as well as the various Perl and R scripts used for the analysis, are available at http://bental.tau.ac.il/ConQuass/.

Contact: nirb@tauex.tau.ac.il

Supplementary information: Supplementary data are available at Bioinformatics online.

Figure 3: Compatibility of the structure with the evolutionary profile of the protein is higher for higher-quality structures or higher-quality multiple sequence alignments, as described by different quality measures. (A) The mean ConQuass score of the proteins in the dataset when filtering only for the top X proteins (x-axis), as measured by several crystallographic structure quality measures: the R-factor, free R-factor and the resolution. (B) As in (A), but when filtering by non-structural measures: the number of residues (red), the number of homologous sequences in the alignment (black), the ratio of residues with insignificant conservation information as measured by ConSurf (green) and the number of homologous sequences in the alignment with at least 20% identity with the query (blue). Also shown is the optimal ratio achieved by integrating these four measures (gray).
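The "top X proteins" filtering described in panel (A) can be illustrated with a short sketch: rank the proteins by a quality measure, keep the best X, and average their scores. This is only an assumed reading of the caption, not the authors' R analysis script; the function name, the dictionary keys and the toy values below are hypothetical and made up for illustration.

```python
def mean_score_of_top(proteins, quality_key, top_x, lower_is_better=True):
    """Mean score of the top_x proteins ranked by one quality measure.

    proteins: list of dicts with a "score" key and the given quality_key.
    lower_is_better: True for measures such as resolution or R-factor,
    where smaller values indicate a better structure.
    """
    if top_x <= 0:
        raise ValueError("top_x must be positive")
    ranked = sorted(proteins, key=lambda p: p[quality_key],
                    reverse=not lower_is_better)
    top = ranked[:top_x]
    return sum(p["score"] for p in top) / len(top)


# Toy, made-up values (NOT data from the paper).
toy_proteins = [
    {"resolution": 3.0, "score": -0.1},
    {"resolution": 1.2, "score": 0.3},
    {"resolution": 2.0, "score": 0.2},
]

# One point of the curve per value of X.
curve = [mean_score_of_top(toy_proteins, "resolution", x)
         for x in range(1, len(toy_proteins) + 1)]
```

Plotting `curve` against X would give one line of a panel like (A); the non-structural measures in panel (B), such as the number of homologous sequences, would use `lower_is_better=False`.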

Mentions: The information in this dataset was used to calculate a propensity matrix, giving the compatibility of each conservation class with each accessibility class (Section 2.3 and Supplementary Table S1). The matrix confirmed our intuitive expectations, giving high propensity scores to accessible variable residues and to buried conserved residues. The matrix was then used to calculate each protein structure's ConQuass score, which was the average of the propensity scores of the protein's residues. A score was calculated for each structure in the dataset (Fig. 2), using the biological unit complexes as given by PQS (Henrick and Thornton, 1998). Only 7.9% of the structures received a negative score, meaning that for most structures the residues' conservation levels tended to be compatible with their accessibility levels. However, when we determined scores for the individual chains in the dataset without the context of the biological unit, more structures were assigned a negative score (12.5%). This was due to monomers exposing conserved interface residues that are actually buried in the physiological complex. We also determined scores for the biological unit complexes given by PISA (Krissinel and Henrick, 2007), and the results were very similar to those obtained for the PQS complexes (data not shown). The ConQuass scores also seemed to become progressively higher for structures of higher quality, as measured by various structure quality measures such as resolution, R-factor and free R-factor (Fig. 3A).
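A minimal sketch of how a propensity matrix and a per-structure score of this kind could be computed. It assumes the propensity of a (conservation class, accessibility class) pair is the log-odds of its observed frequency against the frequency expected if the two classes were independent; the paper's exact definition is in its Section 2.3 and is not reproduced here. The class labels, the pseudocount, the function names and the toy counts are all hypothetical, and this is not the authors' Perl implementation.

```python
import math
from collections import Counter


def propensity_matrix(residues, pseudocount=1.0):
    """Log-odds propensity for each (conservation, accessibility) pair.

    residues: iterable of (conservation_class, accessibility_class) tuples
    pooled over a reference set of structures.
    ASSUMPTION: propensity = log(observed / expected under independence).
    """
    residues = list(residues)
    cons_classes = sorted({c for c, _ in residues})
    acc_classes = sorted({a for _, a in residues})
    counts = Counter(residues)

    # Smoothed cell counts, so that unseen pairs do not give log(0).
    cell = {(c, a): counts[(c, a)] + pseudocount
            for c in cons_classes for a in acc_classes}
    total = sum(cell.values())
    cons_total = {c: sum(cell[(c, a)] for a in acc_classes)
                  for c in cons_classes}
    acc_total = {a: sum(cell[(c, a)] for c in cons_classes)
                 for a in acc_classes}

    matrix = {}
    for (c, a), n in cell.items():
        observed = n / total
        expected = (cons_total[c] / total) * (acc_total[a] / total)
        matrix[(c, a)] = math.log(observed / expected)
    return matrix


def structure_score(structure_residues, matrix):
    """Average propensity over the residues of one structure."""
    values = [matrix[pair] for pair in structure_residues]
    if not values:
        raise ValueError("structure has no residues")
    return sum(values) / len(values)


# Toy, made-up reference counts (NOT data from the paper).
reference = ([("conserved", "buried")] * 40 + [("variable", "exposed")] * 40
             + [("conserved", "exposed")] * 10 + [("variable", "buried")] * 10)
matrix = propensity_matrix(reference)

compatible = [("conserved", "buried"), ("variable", "exposed")]
incompatible = [("conserved", "exposed"), ("variable", "buried")]
good = structure_score(compatible, matrix)    # positive with these toy counts
bad = structure_score(incompatible, matrix)   # negative with these toy counts
```

Under this log-odds assumption, a negative average means the structure's residues fall mostly in class pairs that are rarer than chance in the reference set, which matches the interpretation of negative scores given in the text.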

