Interpretation of ensembles created by multiple iterative rebuilding of macromolecular models.

Terwilliger TC, Grosse-Kunstleve RW, Afonine PV, Adams PD, Moriarty NW, Zwart P, Read RJ, Turk D, Hung LW - Acta Crystallogr. D Biol. Crystallogr. (2007)

Bottom Line: Most of the heterogeneity among models produced in this way is in the side chains and loops on the protein surface. Synthetic data were created in which a crystal structure was modelled as the average of a set of 'perfect' structures and the range of models obtained by rebuilding a single starting model was examined. Instead, the group of structures obtained by repetitive rebuilding reflects the precision of the models, and the standard deviation of coordinates of these structures is a lower bound estimate of the uncertainty in coordinates of the individual models.


Affiliation: Los Alamos National Laboratory, Mailstop M888, Los Alamos, NM 87545, USA. terwilliger@lanl.gov

ABSTRACT
Automation of iterative model building, density modification and refinement in macromolecular crystallography has made it feasible to carry out this entire process multiple times. By using different random seeds in the process, a number of different models compatible with experimental data can be created. Sets of models were generated in this way using real data for ten protein structures from the Protein Data Bank and using synthetic data generated at various resolutions. Most of the heterogeneity among models produced in this way is in the side chains and loops on the protein surface. Possible interpretations of the variation among models created by repetitive rebuilding were investigated. Synthetic data were created in which a crystal structure was modelled as the average of a set of 'perfect' structures and the range of models obtained by rebuilding a single starting model was examined. The standard deviations of coordinates in models obtained by repetitive rebuilding at high resolution are small, while those obtained for the same synthetic crystal structure at low resolution are large, so that the diversity within a group of models cannot generally be a quantitative reflection of the actual structures in a crystal. Instead, the group of structures obtained by repetitive rebuilding reflects the precision of the models, and the standard deviation of coordinates of these structures is a lower bound estimate of the uncertainty in coordinates of the individual models.



Fig. 2: PyMOL view (DeLano, 2002) of the overlay of 20 models of 1cqp obtained by repetitive model rebuilding, density modification and refinement.

Fig. 1 illustrates the progress of rebuilding for one of the 20 models obtained for structure 1cqp at a resolution of 2.6 Å (Kallen et al., 1999). The model obtained after initial rebuilding of the 1cqp structure differs significantly from the starting model (0.47 Å for main-chain atoms, 1.49 Å for side-chain atoms), but subsequent iterations of rebuilding, including the recombination of five independently built models, reduce this difference to 0.21 Å for main-chain atoms and 0.91 Å for side-chain atoms. The starting free R factor in the first cycle of rebuilding was 0.42 and for the final rebuilt model it was 0.27; the corresponding value for the structure 1cqp from the PDB, re-refined with phenix.refine after removal of ligands and solvent, was 0.26. [The free R factor reported for this structure (Kallen et al., 1999) with all ligands and solvent was also 0.26.] The improvement in free R during the rebuilding process and the return of the structure towards the model in the PDB indicate that the rebuilding process generates diversity in the initial rebuilding of the model and then improves the agreement with the data during subsequent rounds of rebuilding, density modification and refinement. Fig. 2 illustrates the final 20 models obtained from rebuilding 1cqp. Most of the diversity among models is in the side chains and most of the heterogeneous side chains are on the surface of the protein. The SD of the coordinates of the models is 0.12 Å for main-chain atoms and 0.53 Å for side-chain atoms. These models differ from the 1cqp model (after re-refinement with phenix.refine without waters or ligands; Kallen et al., 1999) by an r.m.s.d. of 0.18 Å for main-chain atoms and 0.93 Å for side-chain atoms. The maximum-likelihood estimate of overall coordinate uncertainty for the 1cqp model is 0.41 Å (Read, 1986; Lunin et al., 2002).
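The two ensemble statistics quoted above, the SD of coordinates among the rebuilt models and their r.m.s.d. from a reference model, can be illustrated with a minimal NumPy sketch. This is not the authors' code, and the exact definitions are assumptions: the SD is taken here as the root-mean-square distance of each atom from its ensemble-mean position (population form, dividing by the number of models), the r.m.s.d. is pooled over all models and atoms, and all models are assumed to share one coordinate frame and one atom ordering. The function names `ensemble_sd` and `rmsd_to_reference` are hypothetical.

```python
import numpy as np


def ensemble_sd(coords):
    """RMS positional spread of an ensemble about its mean structure.

    coords: array of shape (n_models, n_atoms, 3), in Angstrom.
    Assumed definition: squared distance of each atom from its
    ensemble-mean position, averaged over models and atoms, then rooted.
    """
    coords = np.asarray(coords, dtype=float)
    mean_structure = coords.mean(axis=0)                   # (n_atoms, 3)
    sq_dist = ((coords - mean_structure) ** 2).sum(axis=2)  # (n_models, n_atoms)
    return float(np.sqrt(sq_dist.mean()))


def rmsd_to_reference(coords, reference):
    """Pooled r.m.s.d. of every model in the ensemble from one reference.

    reference: array of shape (n_atoms, 3), same atom order as coords.
    """
    coords = np.asarray(coords, dtype=float)
    reference = np.asarray(reference, dtype=float)
    sq_dist = ((coords - reference) ** 2).sum(axis=2)       # (n_models, n_atoms)
    return float(np.sqrt(sq_dist.mean()))


if __name__ == "__main__":
    # Made-up toy data: 20 models of 50 atoms scattered around a reference.
    rng = np.random.default_rng(0)
    reference = rng.normal(size=(50, 3))
    models = reference + rng.normal(scale=0.1, size=(20, 50, 3))
    print("ensemble SD (A):", ensemble_sd(models))
    print("r.m.s.d. to reference (A):", rmsd_to_reference(models, reference))
```

To reproduce the separate main-chain and side-chain figures, one would select the corresponding atom subsets before calling either function. Under these definitions the spread about the ensemble mean can never exceed the pooled r.m.s.d. from any fixed reference, which is consistent with reading the ensemble SD as a lower bound on coordinate uncertainty.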

