Structure-based predictive models for allosteric hot spots.

Demerdash ON, Daily MD, Mitchell JC - PLoS Comput. Biol. (2009)

Bottom Line: Each residue had an associated set of calculated features. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73-81% and P = 64-71%, the best overall performance of any of the sets of models.


Affiliation: Biophysics Program, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

ABSTRACT
In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used, one consisting of dynamical, structural, network, and informatic measures, and another of structural measures defined by Daily and Gray. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68-81% of known hotspots, and among total hotspot predictions, 58-67% were actual hotspots. Hence, these models have precision P = 58-67% and recall R = 68-81%. The corresponding models for Feature Set 2 had P = 55-59% and R = 81-92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73-81% and P = 64-71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. Moreover, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues.
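
The precision and recall figures quoted in the abstract follow the standard definitions: P is the fraction of predicted hotspots that are actual hotspots, and R is the fraction of known hotspots that were predicted. A minimal sketch of that arithmetic is shown here; the function name and the toy labels are assumptions made for illustration and are not data or code from the paper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for paired boolean labels (True = hotspot)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # predicted hotspot, is hotspot
    fp = sum(1 for p, a in pairs if p and not a)    # predicted hotspot, is not
    fn = sum(1 for p, a in pairs if a and not p)    # missed hotspot
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels for eight residues (not taken from the paper).
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]

p, r = precision_recall(predicted, actual)
print(f"P = {p:.0%}, R = {r:.0%}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so both P and R come out at 75%.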


Figure 4. Hotspot predictions mapped to the inactive state structure of lac repressor. (A) Predictions made by the 9 highest-precision Hybrid Feature Set models according to the voting scheme for lac repressor, mapped onto the inactive state structure (PDB code 1TLF). Experimentally tested residues are rendered in van der Waals spheres, with known non-hotspots in small spheres and known hotspots in larger ones. For other residues, the prediction is shown along the backbone trace, but no experimental data are available to test the prediction. Each residue in the structure is colored according to a blue→green→red heat map whose extremes are as follows: red represents residues predicted to be hotspots by 9/9 models, and blue represents residues predicted to be hotspots by 0/9 models (predicted non-hotspots by 9/9 models). (Refer to the color bar in the figure for the exact mapping of the number of hotspot predictions to color.) For ease of viewing, only one of the dimers (chains A and B) is shown. His 74 and Asp 278, residues that were not in the independent data set but were studied experimentally and found to be allosterically active, are also rendered in van der Waals mode [63]. Correct positive (hotspot) and negative (non-hotspot) predictions are colored according to the heat map, while false predictions are colored gray. The inducer molecule IPTG is rendered as sticks and colored by element. (B) The complete set of residues that caused the I^S phenotype is rendered in van der Waals spheres. The hotspots depicted in (A) are a subset of these for which no substitution caused an I− phenotype (completely nonfunctional). Incorrect predictions, i.e. false negatives, are colored gray.
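
The caption describes each residue being colored by how many of the 9 models call it a hotspot. A minimal sketch of such a tally is given here. The data layout, the function names, and the piecewise-linear blue→green→red interpolation are all assumptions made for illustration; the page does not give the exact color mapping or voting rule used by the authors. The per-model votes are synthetic and are constructed only to reproduce the Lys 84 totals reported in the accompanying text (8/9 in chain A, 4/9 in chain B).

```python
def count_hotspot_votes(model_predictions):
    """Tally, per residue, how many models predict it to be a hotspot.

    model_predictions: list with one dict per model, mapping a residue
    label to True (predicted hotspot) or False (predicted non-hotspot).
    """
    votes = {}
    for predictions in model_predictions:
        for residue, is_hotspot in predictions.items():
            votes[residue] = votes.get(residue, 0) + int(is_hotspot)
    return votes


def heat_color(fraction):
    """Map a vote fraction in [0, 1] to an (r, g, b) tuple.

    Assumed mapping: 0.0 -> blue, 0.5 -> green, 1.0 -> red, linear in between.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if fraction <= 0.5:
        t = fraction / 0.5
        return (0.0, t, 1.0 - t)
    t = (fraction - 0.5) / 0.5
    return (t, 1.0 - t, 0.0)


# Synthetic votes: which individual models vote "hotspot" is hypothetical;
# only the totals (8/9 and 4/9) come from the text.
n_models = 9
models = [{"A:Lys84": i < 8, "B:Lys84": i < 4} for i in range(n_models)]

votes = count_hotspot_votes(models)
for residue, n in votes.items():
    print(residue, f"{n}/{n_models}", heat_color(n / n_models))
```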

Mentions: Furthermore, the locations of predicted hotspots and non-hotspots in the protein structure, together with the known functions of the structural elements of each protein system, gave insight into the functional significance of the predictions. For lac repressor, a large number of predicted hotspots were found at the monomer-monomer interface (Figure 4a; Table S3), especially where the N-terminal domains of the two monomers interact. At this interface, significant alterations of residue-residue interactions occur in the allosteric transition [58]–[60]. Mutations in this region result in a non-inducible (i.e. allosterically unresponsive) phenotype [61],[62]. In addition to the hotspots and non-hotspots included in the independent data set, Figure 4a highlights a key interaction at the monomer-monomer interface: a salt bridge between His 74 and Asp 278 that has been found to be important for the allosteric transition in this system [63]. Both of these residues were predicted to be hotspots by a majority of the models. A striking observation is the asymmetry of some of the predictions between the monomers. For instance, Lys 84, a known hotspot in the independent data set, is predicted to be a hotspot by 8 of 9 models in chain A but by only 4 of 9 models in chain B (Table S3). This is consistent with the observed crystallographic asymmetry between the monomers, especially in the vicinity of Asp 149 [60],[64],[65]. Indeed, we found an all-atom RMSD of 5.55 Å between chains A and B in the inducer-bound state (PDB code 1TLF). Moreover, our finding is supported by the MD simulation study of Flynn et al. [64], who observed structural asymmetry between monomers during targeted MD simulations of the allosteric transition from the DNA-bound to the inducer-bound state, and even during the equilibration phase, with an allosteric signal originating in the “trigger” monomer and propagating to the “response” monomer.
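
For readers unfamiliar with the RMSD quoted above, the quantity is the square root of the mean squared distance between corresponding atoms. The sketch below computes it for two coordinate lists. It assumes the atoms are already paired one-to-one and expressed in a common frame; the page does not say how the authors matched atoms or whether the chains were superimposed first, so this is not a reconstruction of their procedure. The function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes atom i in coords_a corresponds to atom i in coords_b and that
    both sets are already in the same reference frame (no fitting is done).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical two-atom example in Å: every atom is displaced by 3 Å along z.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]

print(f"RMSD = {rmsd(chain_a, chain_b):.2f} Å")
```

In practice the coordinates would be read from the PDB entry with a structure-parsing library, and an optimal superposition would usually be applied before the deviation is computed.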

