Extensive protein and DNA backbone sampling improves structure-based specificity prediction for C2H2 zinc fingers.

Yanover C, Bradley P - Nucleic Acids Res. (2011)

Bottom Line: Sequence-specific DNA recognition by gene regulatory proteins is critical for proper cellular functioning. Here, we present a novel molecular modeling protocol for protein-DNA interfaces that borrows conformational sampling techniques from de novo protein structure prediction to generate a diverse ensemble of structural models from small fragments of related and unrelated protein-DNA complexes. The extensive conformational sampling is coupled with sequence space exploration so that binding preferences for the target protein can be inferred from the resulting optimized DNA sequences.


Affiliation: Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA.

ABSTRACT
Sequence-specific DNA recognition by gene regulatory proteins is critical for proper cellular functioning. The ability to predict the DNA binding preferences of these regulatory proteins from their amino acid sequence would greatly aid in reconstruction of their regulatory interactions. Structural modeling provides one route to such predictions: by building accurate molecular models of regulatory proteins in complex with candidate binding sites, and estimating their relative binding affinities for these sites using a suitable potential function, it should be possible to construct DNA binding profiles. Here, we present a novel molecular modeling protocol for protein-DNA interfaces that borrows conformational sampling techniques from de novo protein structure prediction to generate a diverse ensemble of structural models from small fragments of related and unrelated protein-DNA complexes. The extensive conformational sampling is coupled with sequence space exploration so that binding preferences for the target protein can be inferred from the resulting optimized DNA sequences. We apply the algorithm to predict binding profiles for a benchmark set of eleven C2H2 zinc finger transcription factors, five of known and six of unknown structure. The predicted profiles are in good agreement with experimental binding data; furthermore, examination of the modeled structures gives insight into observed binding preferences.
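As a rough illustration of how a binding profile might be read off a set of optimized DNA sequences, the sketch below tallies per-position base frequencies. This is an assumption made for illustration, not the authors' procedure: the abstract does not say how sequences are weighted or combined, the function name `frequency_profile` is hypothetical, and the example sequences are made up.

```python
from collections import Counter

BASES = "ACGT"


def frequency_profile(sequences):
    """Return per-position base frequencies for equal-length DNA sequences.

    Unweighted counting is an assumption of this sketch; a real protocol
    might weight each sequence, for example by its model energy.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    profile = []
    # zip(*sequences) yields one tuple of bases per binding-site position.
    for column in zip(*sequences):
        counts = Counter(column)
        profile.append({base: counts[base] / len(sequences) for base in BASES})
    return profile


# Hypothetical optimized sequences, invented for illustration only.
example_sequences = ["GCG", "GCG", "GAG", "TCG"]
profile = frequency_profile(example_sequences)
```

Each entry of `profile` is a dictionary mapping A, C, G and T to the fraction of sequences carrying that base at the corresponding position, which is the usual form of a simple position frequency matrix.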

Figure 5: Fragment assembly of unbound DNA structures. (A) Scatter plots of base RMSD to native (excluding terminal base pairs) versus all-atom energy for models built by double-helical fragment assembly followed by all-atom refinement. One energy unit is equivalent to ∼1.3 kcal/mol. (B) Superposition of the 1d49 crystal structure model (in stick representation, carbon atoms are purple) and the 25 lowest-energy fragment assembly models (in wireframe, with gray carbon atoms).

Mentions: We first asked whether the DNA fragment assembly protocol is able to generate acceptable models of unbound DNA duplexes. Recall that the double-helical fragments that make up our DNA fragment library are taken from crystal structures of protein–DNA complexes, in which the DNA is often deformed by interactions with the protein. We selected two high-resolution unbound DNA crystal structures [1d49 (52) and 7bna (53)], containing 10 and 12 base pairs, respectively. For each target, we chose double-helical fragments from our library based on sequence similarity to the DNA sequence, just as in the bound simulations. We then generated 1000 all-atom models by low-resolution fragment assembly followed by high-resolution refinement. The results are depicted in Figure 5: the final models are similar to the corresponding crystal structures, as judged by RMSD (Figure 5A) and by visual inspection of the low-energy models (Figure 5B). These similarity values are within the range seen in molecular dynamics simulations of unbound DNA duplexes (54) (note that these fragment-rebuilding simulations have no input knowledge of the native structure).
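The analysis behind Figure 5 can be sketched roughly as follows: compute an RMSD to the native structure that skips the terminal base pairs, rank models by energy, and keep the lowest-energy subset. This is a minimal sketch under stated assumptions, not the authors' code. The data layout (one list of atom coordinates per base pair), the function names, and the toy numbers are all hypothetical, and the coordinates are assumed to be already superimposed on the native structure; a real pipeline would perform an optimal superposition first.

```python
import math

# Approximate conversion quoted in the Figure 5 caption.
KCAL_PER_ENERGY_UNIT = 1.3


def trim_terminal_pairs(per_pair_coords):
    """Drop the first and last base pair and flatten the remaining atoms."""
    return [atom for pair in per_pair_coords[1:-1] for atom in pair]


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the two coordinate sets are already superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def lowest_energy(models, n=25):
    """Return the n models with the lowest energy, best first."""
    return sorted(models, key=lambda model: model["energy"])[:n]


# Toy data: four base pairs with one atom each (hypothetical values).
native = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)]]
model = [[(9.0, 9.0, 9.0)], [(1.0, 1.0, 0.0)], [(2.0, 1.0, 0.0)], [(0.0, 0.0, 0.0)]]
interior_rmsd = rmsd(trim_terminal_pairs(native), trim_terminal_pairs(model))

# Toy energies in the protocol's energy units (hypothetical values).
models = [
    {"name": "a", "energy": -10.0},
    {"name": "b", "energy": -12.5},
    {"name": "c", "energy": -8.0},
]
best_two = lowest_energy(models, n=2)
best_kcal = best_two[0]["energy"] * KCAL_PER_ENERGY_UNIT
```

Excluding the terminal base pairs keeps the large deviations at the ends of the toy model out of the RMSD, so only the interior pairs contribute. The default `n=25` mirrors the 25 lowest-energy models shown in Figure 5B.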

