A search for energy minimized sequences of proteins.

Jha AN, Ananthasuresh GK, Vishveshwara S - PLoS ONE (2009)

Bottom Line: We use an edge-weighted connectivity graph to rank the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space once the protein is able to perform its function.


Affiliation: Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India.

ABSTRACT
In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph to rank the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to the Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the non-redundant protein sequence database that are similar to ours, with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space once the protein is able to perform its function.
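As a rough illustration of what "energy of a sequence for a target conformation" can mean, the sketch below scores a sequence by summing pairwise matrix entries over residue pairs that are in contact. This is a minimal sketch under stated assumptions, not the authors' implementation: the contact-sum form of the energy, the two-letter H/P alphabet, the matrix values, the contact list, and the function names `contact_energy` and `random_sequence_energies` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical 2-letter reduced alphabet and toy energy matrix (assumed values,
# not one of the three matrices used in the paper).
TOY_MATRIX = {
    "H": {"H": -1.0, "P": 0.0},
    "P": {"H": 0.0, "P": 0.0},
}

# Hypothetical contact list for a 6-residue conformation: pairs of site indices
# assumed to be spatially close in the target structure.
TOY_CONTACTS = [(0, 3), (1, 4), (0, 5), (2, 5)]


def contact_energy(sequence, contacts, energy_matrix):
    """Sum the pairwise matrix energy over all residue pairs in contact."""
    return sum(energy_matrix[sequence[i]][sequence[j]] for i, j in contacts)


def random_sequence_energies(length, contacts, energy_matrix, n_samples, seed=0):
    """Energies of uniformly random sequences threaded onto the same contacts."""
    rng = random.Random(seed)
    alphabet = sorted(energy_matrix)
    energies = []
    for _ in range(n_samples):
        seq = [rng.choice(alphabet) for _ in range(length)]
        energies.append(contact_energy(seq, contacts, energy_matrix))
    return energies


if __name__ == "__main__":
    print(contact_energy("HPPHPH", TOY_CONTACTS, TOY_MATRIX))
    print(random_sequence_energies(6, TOY_CONTACTS, TOY_MATRIX, n_samples=5))
```

With a model of this kind, the conformation stays fixed and only the sequence varies, which is why random sequences, designed sequences, and the native sequence can all be compared on one energy axis.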

Figure 4 (pone-0006684-g004): Energy profile of random and designed sequences. Energy distributions of random (curve 1) and designed (curve 2) sequences obtained for three proteins: (a) 1ZIP, (b) 5TIM, and (c) 1I6M. The triangle marker (Δ) on the left indicates the lower bound energy, while the inverted triangle marker on the right shows the energy of the native sequence. Notice that the energy of the native sequence lies between the mean energies of the random and designed sequences' energy distributions. The lower bound energy found by our method is much lower than the energy of the native sequence in all cases.

Mentions: The same behavior that was explained above for Ribonuclease A and Lysozyme was observed for the other three proteins. Figures 4a-c show that the energy of the native sequence always lies between the energies of the random sequences and the designed sequences. The mean energy and the standard deviation for designed and random sequences for all the chosen proteins are summarized in Table 2. It shows that the energy of the native sequence is at least one standard deviation lower than the mean energy of the generated random sequences, whereas it is much higher (by 7–22 standard deviations) than the mean energy of the designed sequences. Another interesting point is that the standard deviation for the random sequences is much larger than that of the designed sequences. This shows that the energies of the designed sequences are not as widely spread as those of the random sequences, and that their Gaussian distribution has a sharp peak. The same behavior was seen in all five proteins that we considered here. The mean of the energy of the random sequences gives a tight upper bound on the energy for a given target conformation. Thus, we have been able to provide a lower and an upper bound for the energy of the sequences for a given protein.
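The comparison "how many standard deviations is the native energy from the mean of a distribution" can be sketched as a simple standardized distance. The snippet below is a hedged illustration only: the function name `deviations_from_mean` and the toy energy values are hypothetical and are not data from Table 2, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def deviations_from_mean(native_energy, sample_energies):
    """Signed distance of the native energy from the sample mean,
    in units of the sample standard deviation (negative = lower energy)."""
    mu = mean(sample_energies)
    sigma = stdev(sample_energies)  # sample standard deviation (assumed choice)
    return (native_energy - mu) / sigma


if __name__ == "__main__":
    # Toy values for illustration only; not results from the paper.
    random_energies = [-1.0, 0.0, 1.0]       # broad distribution
    designed_energies = [-10.5, -10.0, -9.5]  # narrow distribution
    native = -2.0
    print(deviations_from_mean(native, random_energies))
    print(deviations_from_mean(native, designed_energies))
```

A negative value against the random distribution and a large positive value against the designed distribution would correspond to the pattern described above, where the native energy sits below the random mean but well above the designed mean.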

