Free kick instead of cross-validation in maximum-likelihood refinement of macromolecular crystal structures.
Bottom Line:
Current maximum-likelihood (ML) target functions utilize phase-error estimates that are calculated from a small fraction of the diffraction data, called the test set, that is not used to fit the model. The alternative approach is called ML free-kick refinement, as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps, and may use a smaller portion of the data as the test set for the calculation of Rfree, or may leave the test set out completely.
Affiliation: Department of Biochemistry and Molecular and Structural Biology, Institute Jožef Stefan, Jamova 39, 1000 Ljubljana, Slovenia.
ABSTRACT
The refinement of a molecular model is a computational procedure by which the atomic model is fitted to the diffraction data. The commonly used target in the refinement of macromolecular structures is the maximum-likelihood (ML) function, which relies on the assessment of model errors. The current ML functions rely on cross-validation: they utilize phase-error estimates that are calculated from a small fraction of the diffraction data, called the test set, that is not used to fit the model. An approach has been developed that instead uses the work set to calculate the phase-error estimates in ML refinement, by simulating the model errors via the random displacement of atomic coordinates. It is called ML free-kick refinement, as it uses the ML formulation of the target function and is based on the idea of freeing the model from the model bias imposed by the chemical energy restraints used in refinement. This approach for the calculation of error estimates is superior to the cross-validation approach: it reduces the phase error and increases the accuracy of molecular models, is more robust, provides clearer maps, and may use a smaller portion of the data as the test set for the calculation of Rfree, or may leave the test set out completely.
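The random displacement of atomic coordinates mentioned in the abstract can be illustrated with a minimal Python sketch. The function name `kick_coordinates`, the `max_shift` parameter, and the choice of a uniform distribution per axis are assumptions made for illustration only; the abstract does not specify how the displacements are drawn, and this is not the authors' implementation.

```python
import random


def kick_coordinates(coords, max_shift, rng):
    """Return a displaced copy of a list of (x, y, z) atomic coordinates.

    Assumption (not from the source): every coordinate component is shifted
    independently by a value drawn uniformly from [-max_shift, max_shift].
    The input list is left unchanged.
    """
    return [
        tuple(component + rng.uniform(-max_shift, max_shift) for component in atom)
        for atom in coords
    ]


# Hypothetical three-atom model, coordinates in angstroms.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
kicked = kick_coordinates(atoms, max_shift=0.3, rng=random.Random(0))
```

In a scheme of this kind, the difference between quantities computed from the original and the displaced model would stand in for the model error; how that difference enters the ML target is not described in the abstract and is not shown here.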
To analyze the robustness and convergence of the target functions in refinement, we chose five cases starting with molecular-replacement solutions. Analysis of the phase errors of the refined molecular-replacement models shows that the phase errors and the variability of structures refined with the ML FK approach are lower in all cases (Fig. 4). Fig. 4 also reveals the general trend of the ML FK function: the size of the work set negatively correlates with the phase error. This relationship is not evident for the ML CV approach, where a test set of 10% of the data resulted in the lowest phase error in one instance (Fig. 4d). Concerning the distribution of the final phase errors, the small size of the test set, on which the scaling of the ML CV approach depends, evidently produces much variation. Comparison of Fig. 4 with Table 1 indicates that, in the ML CV approach, the spread of phase errors is larger when the test set contains fewer data. This comparison also makes evident that the spreads of the phase errors for the largest test sets (10% of the data) in the ML CV cases are notably larger than those in the ML FK cases. The narrowest spread of phase errors, observed for the 2fy2 case with the largest test-set sizes, also reflects the fact that in this case the starting molecular-replacement model was the most similar in structure and sequence to the final structure.
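The comparison above rests on two quantities: a phase error for each refined model and the spread of that error over repeated refinements. The sketch below shows one common way to compute them. It assumes an unweighted mean of absolute phase differences in degrees and the sample standard deviation as the spread; the excerpt does not state which weighting or spread measure was used, and the function names are hypothetical.

```python
import statistics


def phase_difference(phi_a, phi_b):
    """Smallest absolute difference between two phases in degrees (0 to 180)."""
    d = abs(phi_a - phi_b) % 360.0
    return min(d, 360.0 - d)


def mean_phase_error(model_phases, reference_phases):
    """Unweighted mean phase difference between two equally long phase lists.

    Assumption (not from the source): no figure-of-merit or amplitude weighting.
    """
    if len(model_phases) != len(reference_phases):
        raise ValueError("phase lists must have the same length")
    diffs = [phase_difference(a, b) for a, b in zip(model_phases, reference_phases)]
    return sum(diffs) / len(diffs)


def error_spread(errors):
    """Mean and sample standard deviation of phase errors from repeated runs."""
    return statistics.mean(errors), statistics.stdev(errors)
```

Because phases are periodic, the difference is wrapped so that, for example, 350° and 10° are treated as 20° apart rather than 340°.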