Extensive protein and DNA backbone sampling improves structure-based specificity prediction for C2H2 zinc fingers.

Yanover C, Bradley P - Nucleic Acids Res. (2011)

Bottom Line: Sequence-specific DNA recognition by gene regulatory proteins is critical for proper cellular functioning. Here, we present a novel molecular modeling protocol for protein-DNA interfaces that borrows conformational sampling techniques from de novo protein structure prediction to generate a diverse ensemble of structural models from small fragments of related and unrelated protein-DNA complexes. The extensive conformational sampling is coupled with sequence space exploration so that binding preferences for the target protein can be inferred from the resulting optimized DNA sequences.


Affiliation: Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA.

ABSTRACT
Sequence-specific DNA recognition by gene regulatory proteins is critical for proper cellular functioning. The ability to predict the DNA binding preferences of these regulatory proteins from their amino acid sequence would greatly aid in reconstruction of their regulatory interactions. Structural modeling provides one route to such predictions: by building accurate molecular models of regulatory proteins in complex with candidate binding sites, and estimating their relative binding affinities for these sites using a suitable potential function, it should be possible to construct DNA binding profiles. Here, we present a novel molecular modeling protocol for protein-DNA interfaces that borrows conformational sampling techniques from de novo protein structure prediction to generate a diverse ensemble of structural models from small fragments of related and unrelated protein-DNA complexes. The extensive conformational sampling is coupled with sequence space exploration so that binding preferences for the target protein can be inferred from the resulting optimized DNA sequences. We apply the algorithm to predict binding profiles for a benchmark set of eleven C2H2 zinc finger transcription factors, five of known and six of unknown structure. The predicted profiles are in good agreement with experimental binding data; furthermore, examination of the modeled structures gives insight into observed binding preferences.
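
As a rough illustration of how a binding profile might be derived from a set of optimized DNA sequences, the sketch below counts bases column by column to form a position frequency matrix (PFM). This is a minimal sketch under stated assumptions, not the authors' protocol: the function name, the uniform weighting of sequences, and the optional pseudocount are all hypothetical, and the abstract does not say how the modeled sequences are actually weighted or combined.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sequences, pseudocount=0.0):
    """Return one {base: frequency} dict per column of equal-length DNA sequences.

    Assumption: every sequence contributes equally. A real protocol may
    weight each sequence, for example by the energy of its structural model.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    total = len(sequences) + pseudocount * len(BASES)
    matrix = []
    for col in range(length):
        counts = Counter(seq[col] for seq in sequences)
        matrix.append(
            {base: (counts.get(base, 0) + pseudocount) / total for base in BASES}
        )
    return matrix


if __name__ == "__main__":
    # Hypothetical optimized sequences for a single three-base subsite.
    pfm = position_frequency_matrix(["GCG", "GCG", "GAG", "TCG"])
    for column_number, frequencies in enumerate(pfm, start=1):
        print(column_number, frequencies)
```

Each column of the result sums to 1, so it can be compared directly with an experimentally derived profile of the same length.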

© Copyright Policy - creative-commons

Figure 10: Structural basis of binding specificity: determinants of nucleotide preferences in the fragment assembly models are analyzed at three sites in the benchmark set. The base at the site of interest is colored yellow; interacting sidechains are colored green; additional interacting bases and sidechains are shown in pink and purple. (A) Asn at helix position 3 (‘N3’) makes a bidentate hydrogen bond with A at position 2 (‘A2’) in the Ypr013c site. (B) Arg at position −1 can form a bidentate hydrogen bond with G at position 3 in the Mig1 site. (C) Correlation between helix positions breaks the simple logic of a ZF recognition code: when Asn is also present at helix position 3, Gln at helix position 6 forms a pair of hydrogen bonds, one with C at position 1 and one with the T that is paired with an A at position 2. Gln at helix position 6 had been proposed to specify A rather than C at position 1 (28).

Mentions: Binding specificity predictions. For each benchmark protein, the experimental binding profile is shown above the structure-based specificity prediction. Experimental data sources are indicated in brackets. PFM columns are numbered so that columns 1-3 correspond to the last finger, columns 4-6 correspond to the second-to-last finger, and so on (see Figure 4). For the three boxed columns, structural determinants of binding preferences are illustrated in Figure 10.
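
The column numbering described above can be made concrete with a small helper. This is only a sketch: the function name and the returned (finger, offset) pair are hypothetical, and it assumes exactly three PFM columns per finger with no overlap between neighboring subsites, which the text implies but does not state for every benchmark protein.

```python
def finger_for_column(column, n_fingers):
    """Map a 1-based PFM column to (finger, offset_in_triplet).

    Fingers are numbered 1..n_fingers from the first to the last finger.
    Columns 1-3 map to the last finger, columns 4-6 to the second-to-last
    finger, and so on. The offset is the 1-based position within the
    three-column block.
    """
    if n_fingers < 1:
        raise ValueError("n_fingers must be at least 1")
    if not 1 <= column <= 3 * n_fingers:
        raise ValueError("column is outside the range covered by the fingers")
    block = (column - 1) // 3          # 0 for columns 1-3, 1 for columns 4-6, ...
    finger = n_fingers - block         # count back from the last finger
    offset = (column - 1) % 3 + 1      # 1, 2 or 3 within the block
    return finger, offset


if __name__ == "__main__":
    # Hypothetical three-finger protein: columns 1-9.
    for column in range(1, 10):
        print(column, finger_for_column(column, n_fingers=3))
```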

