Improved model quality assessment using ProQ2.

Ray A, Lindahl E, Wallner B - BMC Bioinformatics (2012)

Bottom Line: Improved performance is obtained by combining previously used features with updated structural and predicted features. The most important contributions come from profile weighting of the residue-specific features and from the use of features averaged over the whole model, even though the prediction is still local. The absolute quality assessment of the models at both the local and global level is also improved.


Affiliation: Department of Theoretical Physics & Swedish eScience Research Center, Royal Institute of Technology, Stockholm, Sweden.

ABSTRACT

Background: Employing methods to assess the quality of modeled protein structures is now standard practice in bioinformatics. Broadly, the techniques can be divided into methods relying on consensus prediction on the one hand and single-model methods on the other. Consensus methods frequently perform very well when there is a clear consensus, but this is not always the case. In particular, they frequently fail to select the best possible model in the hard cases (lacking consensus) or in the easy cases where models are very similar. In contrast, single-model methods do not suffer from these drawbacks and could potentially be applied to any protein of interest to assess quality or as a scoring function for sampling-based refinement.

Results: Here, we present a new single-model method, ProQ2, based on ideas from its predecessor, ProQ. ProQ2 is a model quality assessment algorithm that uses support vector machines to predict local as well as global quality of protein models. Improved performance is obtained by combining previously used features with updated structural and predicted features. The most important contributions come from profile weighting of the residue-specific features and from the use of features averaged over the whole model, even though the prediction is still local.

Conclusions: ProQ2 is significantly better than its predecessors at detecting high-quality models, improving the sum of Z-scores for the selected first-ranked models by 20% and 32% compared to the second-best single-model method in CASP8 and CASP9, respectively. The absolute quality assessment of the models at both the local and global level is also improved. Pearson's correlation between the predicted and correct local scores improves from 0.59 to 0.70 on CASP8 and from 0.62 to 0.68 on CASP9; the correlation between the predicted global score and the correct GDT_TS improves from 0.75 to 0.80 and from 0.77 to 0.80, again compared to the second-best single-model method in CASP8 and CASP9, respectively. ProQ2 is available at http://proq2.wallnerlab.org.

Figure 2: Local quality prediction performance as measured by the average distance deviation for different fractions of top-ranking residues for CASP8 (A) and CASP9 (B).

Mentions: To get an idea of how good the top-ranking residues from the different methods are, we also calculated the average distance deviation from the true value for different fractions of top-ranking residues (Figure 2). This measure should ideally be as low as possible, but it will gradually increase to the average deviation over the whole set as the fraction grows. On CASP8 (Figure 2A), ProQ2 has a much lower average distance than ProQ, and a lower average distance than QMEAN, for the same fraction of top-ranking residues. This is also maintained on the CASP9 data set, even though the gap to QMEAN is smaller and the new single-model method MetaMQAP performs between ProQ2 and QMEAN.
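
The measure described above can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes each residue has a predicted quality score where higher means better, and a true distance deviation from the native structure in Å. The function name `mean_deviation_by_top_fraction` and the toy numbers are hypothetical, chosen only for illustration.

```python
import math


def mean_deviation_by_top_fraction(predicted_scores, true_deviations, fractions):
    """Mean true deviation of the residues ranked best by predicted score.

    Assumption (not from the paper): a higher predicted score means the
    residue is predicted to be of higher quality.
    """
    if len(predicted_scores) != len(true_deviations):
        raise ValueError("score and deviation lists must have equal length")
    if not predicted_scores:
        raise ValueError("at least one residue is required")

    # Rank residues from best to worst predicted quality.
    order = sorted(
        range(len(predicted_scores)),
        key=lambda i: predicted_scores[i],
        reverse=True,
    )
    ranked_deviations = [true_deviations[i] for i in order]

    n = len(ranked_deviations)
    result = {}
    for fraction in fractions:
        if not 0 < fraction <= 1:
            raise ValueError("fractions must lie in (0, 1]")
        # Keep at least one residue, rounding the cut-off upwards.
        k = max(1, math.ceil(fraction * n))
        result[fraction] = sum(ranked_deviations[:k]) / k
    return result


# Toy example with four residues (made-up numbers).
scores = [0.9, 0.1, 0.5, 0.7]
deviations = [1.0, 8.0, 4.0, 2.0]
curve = mean_deviation_by_top_fraction(scores, deviations, [0.25, 0.5, 1.0])
print(curve)
```

With a fraction of 1.0 the value equals the average deviation over the whole set, which is why the curves in Figure 2 converge as the fraction grows; a better predictor keeps the curve lower at small fractions.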

