Structure-based prediction of C2H2 zinc-finger binding specificity: sensitivity to docking geometry.

Siggers TW, Honig B - Nucleic Acids Res. (2007)

Bottom Line: Using a molecular-mechanics force field, we predict high-affinity nucleotide sequences that bind to the second zinc-finger (ZF) domain from the Zif268 protein, using different C2H2 ZF domains as structural templates. We identify a strong relationship between IAS values and prediction accuracy, and define a range of IAS values for which accurate structure-based predictions of binding specificity are to be expected. The implications of our results for large-scale, structure-based prediction of PWMs are discussed.


Affiliation: Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, Room 815, New York, NY 10032, USA.

ABSTRACT
Predicting the binding specificity of transcription factors is a critical step in the characterization and computational identification of cis-regulatory elements in genomic sequences. Here we use protein-DNA structures to predict binding specificity and consider the possibility of predicting position weight matrices (PWMs) for an entire protein family based on the structures of just a few family members. A particular focus is the sensitivity of prediction accuracy to the docking geometry of the structure used. We investigate this issue with the goal of determining how similar two docking geometries must be for binding specificity predictions to be accurate. Docking similarity is quantified using our recently described interface alignment score (IAS). Using a molecular-mechanics force field, we predict high-affinity nucleotide sequences that bind to the second zinc-finger (ZF) domain from the Zif268 protein, using different C2H2 ZF domains as structural templates. We identify a strong relationship between IAS values and prediction accuracy, and define a range of IAS values for which accurate structure-based predictions of binding specificity are to be expected. The implications of our results for large-scale, structure-based prediction of PWMs are discussed.

Figure 4: Native and predicted His(3) side-chain conformations. Side-chain conformations for the His residue at canonical ZF position 3 are shown from hrZif268 ZF2 (1aay His149; white), and from the complexes modeled with the TGG sequence (red) and TTG sequence (brown) using hrZif268 ZF2 as a template. DNA bases are shown in CPK coloring and correspond to the modeled TG (TGG) and TT (TTG) bases. Residue numbering as in Figure 1.

Mentions: Table 3 lists the seven high-affinity binding sequences identified by Bulyk et al. and the ten highest-affinity sequences predicted using hrZif268 ZF2 as the template for itself. As the table shows, the high-affinity TGG sequence is correctly predicted as the strongest binder; the TGG, TAG, and GGG sequences are correctly identified as the three highest-affinity sequences; and all seven of the highest-affinity experimentally determined binding sequences appear among the top eight predicted sequences. The correct prediction of TTG as a high-affinity sequence is noteworthy. In the complex modeled with the TTG sequence, the His(3) side chain adopts a conformation considerably different from those predicted for all other sequences: His(3) rotates out of the major groove and forms a hydrogen bond with a DNA-backbone phosphate group (Figure 4). Predicting this alternate side-chain conformation is required to correctly identify TTG as a high-affinity site. This demonstrates both the importance of allowing side-chain flexibility in the modeling process and that a template structure can be used to model bound complexes in which the side-chain conformations differ from those in the original template. The striking agreement of the predicted high-affinity sequences with all seven experimentally determined high-affinity sequences demonstrates that our atomic-level modeling approach can yield highly accurate predictions given an appropriate template structure. Furthermore, it suggests that the docking geometry of the hrZif268 ZF2 template bound to the TGG sequence is a reasonable representation of the docking geometry when bound to the alternate high-affinity sequences.
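
The comparison described above (how many experimentally determined high-affinity sites fall within the top k predicted sequences) can be illustrated with a minimal sketch. This is not the authors' code. The function name `top_k_recovery` is hypothetical, and the example lists are illustrative only: they reuse the four triplets named in the text (TGG, TAG, GGG, TTG) and pad the ranking with invented filler entries, so they do not reproduce Table 3.

```python
def top_k_recovery(experimental, predicted_ranked, k):
    """Return (fraction, found) for experimental sites present in the top-k predictions.

    experimental     -- list of experimentally determined high-affinity sequences
    predicted_ranked -- predicted sequences, ordered from highest to lowest affinity
    k                -- number of top-ranked predictions to consider
    """
    if not experimental:
        raise ValueError("experimental list must not be empty")
    top_k = set(predicted_ranked[:k])
    # Preserve the order of the experimental list in the reported hits.
    found = [seq for seq in experimental if seq in top_k]
    return len(found) / len(experimental), found


# ASSUMPTION: illustrative data only, not the contents of Table 3.
# The four triplets below are the ones named in the text; the paper lists seven.
experimental = ["TGG", "TAG", "GGG", "TTG"]
# ASSUMPTION: hypothetical ranking; "AAA" and "CCC" are invented filler entries.
predicted_ranked = ["TGG", "GGG", "TAG", "AAA", "TTG", "CCC"]

frac_top3, found_top3 = top_k_recovery(experimental, predicted_ranked, 3)
frac_top5, found_top5 = top_k_recovery(experimental, predicted_ranked, 5)
print(f"top-3 recovery: {frac_top3:.2f} {found_top3}")
print(f"top-5 recovery: {frac_top5:.2f} {found_top5}")
```

With the real Table 3 data, the same call with k = 8 would express the "all seven in the top eight" statement as a recovery fraction of 1.0.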

