ARP/wARP and molecular replacement: the next generation.

Cohen SX, Ben Jelloul M, Long F, Vagin A, Knipscheer P, Lebbink J, Sixma TK, Lamzin VS, Murshudov GN, Perrakis A - Acta Crystallogr. D Biol. Crystallogr. (2007)

Bottom Line: More than 100 molecular-replacement solutions automatically solved by the BALBES software were submitted to three standard protocols in flex-wARP and the results were compared with final models from the PDB. Standard metrics were gathered in a systematic way and enabled the drawing of statistical conclusions on the advantages of each protocol. This highlights the diversity of paths that the flex-wARP control system can employ to reach a nearly complete and accurate model while actually starting from the same initial information.


Affiliation: Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Amsterdam, The Netherlands. s.cohen@nki.nl

ABSTRACT
Automatic iterative model (re-)building, as implemented in ARP/wARP and its new control system flex-wARP, is particularly well suited to follow structure solution by molecular replacement. More than 100 molecular-replacement solutions automatically solved by the BALBES software were submitted to three standard protocols in flex-wARP and the results were compared with final models from the PDB. Standard metrics were gathered in a systematic way and enabled the drawing of statistical conclusions on the advantages of each protocol. Based on this analysis, an empirical estimator was proposed that predicts, from the experimental data and the quality of the molecular-replacement solution, how good the final model produced by flex-wARP is likely to be. To introduce the differences between the three flex-wARP protocols (keeping the complete search model, converting it to atomic coordinates but ignoring atom identities, or using the electron-density map calculated from the molecular-replacement solution), two examples are also discussed in detail, focusing on the evolution of the models during iterative rebuilding. This highlights the diversity of paths that the flex-wARP control system can employ to reach a nearly complete and accurate model while actually starting from the same initial information.

Fig. 4: Box plot of the results of flex-wARP (running in default mode, keeping the initial model). The data sets were divided into five groups based either on the initial R factor (left column) or on the high-resolution limit of the data (right column). The boundaries of each group are labelled on the x axis. In each category the relative width of the box corresponds to the number of data sets in the category; the box itself spans vertically from the first to the third quartile, whilst the bold line is situated at the median; whiskers represent the full spread of the distribution, whilst open circles represent outliers. The top two graphs represent the fraction of residues built (white boxes) and the fraction of residues assigned to sequence (hence having side chains built; grey boxes). The bottom two graphs give the values of the correlation of the map obtained by flex-wARP with the reference map.
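The quantities drawn in such a box plot can be reproduced with a few lines of Python. The following is only a minimal sketch under stated assumptions: the function names `group_by_bins` and `box_summary`, the bin edges and the toy values are all hypothetical, and the 1.5 × IQR rule used to separate whiskers from outliers is a common plotting default rather than something stated in the caption.

```python
from bisect import bisect_right
from statistics import median, quantiles


def group_by_bins(records, edges):
    """Sort (x, y) records into bins; bin i holds y where edges[i] <= x < edges[i+1]."""
    groups = [[] for _ in range(len(edges) - 1)]
    for x, y in records:
        i = bisect_right(edges, x) - 1
        if 0 <= i < len(groups):  # values outside the outer edges are dropped
            groups[i].append(y)
    return groups


def box_summary(values, fence=1.5):
    """Quartiles, median, whisker ends and outliers for one box (needs >= 2 values)."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: points further than fence * IQR from the box count as outliers.
    low, high = q1 - fence * iqr, q3 + fence * iqr
    inside = [v for v in data if low <= v <= high]
    return {
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in data if v < low or v > high],
    }


# Made-up (initial R factor in %, fraction of residues built) pairs, NOT from the paper.
toy = [(24, 0.95), (27, 0.90), (33, 0.60), (36, 0.20), (38, 0.85)]
for values in group_by_bins(toy, edges=[20, 30, 40]):
    print(box_summary(values))
```

Grouping by the high-resolution limit instead of the R factor would only change the first element of each pair and the bin edges.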
© Copyright Policy - open-access

Mentions: The default protocol (§2.2; Fig. 4) shows that when the initial Rwork is better than 30%, automatic model building is likely to produce useful results. Conversely, a molecular-replacement solution with an Rwork between 30 and 40% is almost equally likely to be rescued by the default flex-wARP procedure or to fail to produce results of any use; in such cases there is a tendency to improve the map quality (as shown by the values of map correlation) but to produce fairly incomplete models. When success is assessed as a function of resolution, the fundamental tendencies of ARP/wARP show up: when data extending to better than 2.0 Å are available, ARP/wARP fails only occasionally (presumably when the starting model is very poor). Between 2.0 and 2.5 Å models are generally less complete and more cases fail, but most runs are still successful. With data that do not extend to 2.5 Å there are occasional successes that produce models close to 80% complete, while with data that do not extend to 3.0 Å we did not observe a single successful case. These observations are well correlated with the general ARP/wARP success rates, but also show that ARP/wARP can often produce good model-building results even from data that do not extend beyond 2.5 Å.
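The thresholds quoted in this paragraph can be written down as a simple lookup. This is a hedged sketch that merely restates the prose: it is not the empirical estimator proposed in the paper, the function names `rwork_outlook` and `resolution_outlook` and the returned labels are hypothetical, and the treatment of values lying exactly on a boundary is an assumption.

```python
def rwork_outlook(r_work_percent):
    """Qualitative outlook for the default protocol from the initial R work (in %)."""
    if r_work_percent < 30.0:
        return "likely useful"
    if r_work_percent <= 40.0:
        # Roughly equal chance of rescue or failure; maps tend to improve,
        # but models tend to stay fairly incomplete.
        return "uncertain"
    return "not discussed"  # the paragraph makes no statement above 40%


def resolution_outlook(d_min):
    """Qualitative outlook from the high-resolution limit d_min in Å (smaller is better)."""
    if d_min < 2.0:
        return "fails only occasionally"
    if d_min <= 2.5:
        return "mostly successful, less complete"
    if d_min <= 3.0:
        return "occasional successes"
    return "no success observed"


# Hypothetical data set: initial R work of 34% with data extending to 2.3 Å.
print(rwork_outlook(34.0), "|", resolution_outlook(2.3))
```

Keeping the two criteria in separate functions mirrors the figure, which reports the R-factor and resolution groupings independently and does not combine them.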

