Orientation-dependent backbone-only residue pair scoring functions for fixed backbone protein design.

Bordner AJ - BMC Bioinformatics (2010)

Bottom Line: The threading accuracy was found to steadily increase with increasing dimension, with the 6D scoring function achieving the highest accuracy. The sensitivity of this method to backbone structure perturbations was compared with that of fixed-backbone all-atom modeling by determining the similarities between optimal sequences for two different backbone structures within the same protein family. The results showed that the design method using 6D scoring functions was more robust to small variations in backbone structure than the all-atom design method.


Affiliation: Mayo Clinic, Scottsdale, AZ 85259, USA. bordner.andrew@mayo.edu

ABSTRACT

Background: Empirical scoring functions have proven useful in protein structure modeling. Most such scoring functions depend on protein side chain conformations. However, backbone-only scoring functions do not require computationally intensive structure optimization and so are well suited to protein design, which requires fast score evaluation. Furthermore, scoring functions that account for the distinctive relative position and orientation preferences of residue pairs are expected to be more accurate than those that depend only on the separation distance.

Results: Residue pair scoring functions for fixed backbone protein design were derived using only backbone geometry. Unlike previous studies that used spherical harmonics to fit 2D angular distributions, Gaussian Mixture Models were used to fit the full 3D (position only) and 6D (position and orientation) distributions of residue pairs. The performance of the 1D (residue separation only), 3D, and 6D scoring functions was compared by their ability to identify correct threading solutions for a non-redundant benchmark set of protein backbone structures. The threading accuracy was found to steadily increase with increasing dimension, with the 6D scoring function achieving the highest accuracy. Furthermore, the 3D and 6D scoring functions were shown to outperform side chain-dependent empirical potentials from three other studies. Next, two computational methods that take advantage of the speed and pairwise form of these new backbone-only scoring functions were investigated. The first is a procedure that exploits available sequence data by averaging scores over threading solutions for homologs. This was evaluated by applying it to the challenging problem of identifying interacting transmembrane alpha-helices and was found to further improve prediction accuracy. The second is a protein design method for determining the optimal sequence for a backbone structure by applying Belief Propagation optimization using the 6D scoring functions. The sensitivity of this method to backbone structure perturbations was compared with that of fixed-backbone all-atom modeling by determining the similarities between optimal sequences for two different backbone structures within the same protein family. The results showed that the design method using 6D scoring functions was more robust to small variations in backbone structure than the all-atom design method.
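
The pairwise form of the scores and the homolog-averaging procedure described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`threading_score`, `homolog_averaged_score`, `toy_pair_score`), the `contacts` dictionary, and the toy score values are all hypothetical, the geometry descriptor is reduced to a single distance rather than the 3D or 6D coordinates, and the homolog sequences are assumed to be aligned to the same backbone positions without gaps.

```python
from statistics import mean


def threading_score(sequence, pair_score, contacts):
    """Sum a pairwise score over all position pairs listed in `contacts`.

    sequence   : one-letter residue codes threaded onto the backbone
    pair_score : callable (res_a, res_b, geometry) -> float (hypothetical)
    contacts   : dict mapping (i, j) position pairs to a geometry descriptor
    """
    return sum(
        pair_score(sequence[i], sequence[j], geometry)
        for (i, j), geometry in contacts.items()
    )


def homolog_averaged_score(aligned_homologs, pair_score, contacts):
    """Average the threading score over homologs threaded onto one backbone.

    Assumes every homolog has a residue at every position (no gaps).
    """
    return mean(
        threading_score(seq, pair_score, contacts) for seq in aligned_homologs
    )


def toy_pair_score(res_a, res_b, distance):
    """Illustrative values only; these numbers are not from the paper."""
    if res_a == "C" and res_b == "C" and distance < 4.5:
        return 2.0   # close Cys-Cys pair is favorable
    if res_a in "DE" and res_b in "DE" and distance < 6.0:
        return -1.0  # close same-charge acidic pair is unfavorable
    return 0.0


# Two position pairs of a four-residue backbone, keyed by separation in Å.
contacts = {(0, 3): 4.0, (1, 2): 5.0}

single = threading_score("CEEC", toy_pair_score, contacts)
averaged = homolog_averaged_score(["CEEC", "CAEC"], toy_pair_score, contacts)
```

Because the total is a plain sum over residue pairs, scoring an additional homolog costs only one more pass over the same contact list, which is what makes averaging over many homologs cheap.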

Conclusions: Backbone-only residue pair scoring functions that account for all six relative degrees of freedom are the most accurate, and including the scores of homologs further improves the accuracy in threading applications. The 6D scoring function outperformed several side chain-dependent potentials while avoiding time-consuming and error-prone side chain structure prediction. These scoring functions are particularly useful as an initial filter in protein design problems before applying all-atom modeling.



Figure 1: 1D log-odds scores as a function of Cβ separation for Ala-Ala, Cys-Cys, and Glu-Glu residue pairs. The Cys-Cys function has a peak near the typical Cβ separation for disulfide bonds, in the range of 3.5-4.0 Å, and is negative for large separations. In contrast, the score for the same-charge Glu-Glu pairs is negative for small separations and positive for large separations, reflecting the electrostatic energy penalty for close proximity. Both the Cys-Cys and Glu-Glu scores are among the most accurate because of these physical constraints on their separations. The Ala-Ala score, shown for comparison, shows oscillatory behavior with a peak near that of the Cys-Cys score.
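
As a rough illustration of what a 1D log-odds score of this kind looks like, the sketch below bins Cβ separations and takes the logarithm of the ratio between the frequency observed for one residue pair type and a reference frequency. It is only a sketch under stated assumptions: the excerpt does not say how the 1D scores or their reference distribution were actually constructed, so the histogram approach, the pseudocount, the bin width, the function name `log_odds_profile`, and the sample distances are all hypothetical.

```python
import math
from collections import Counter


def log_odds_profile(pair_distances, reference_distances,
                     bin_width=0.5, pseudocount=1.0):
    """Return {bin_start: log(P_pair / P_ref)} over binned separations (Å).

    pair_distances      : separations observed for one residue pair type
    reference_distances : separations pooled over a reference set (assumed)
    """
    def binned(distances):
        # Map each distance to the index of its bin.
        return Counter(int(d // bin_width) for d in distances)

    pair_counts = binned(pair_distances)
    ref_counts = binned(reference_distances)
    bins = sorted(set(pair_counts) | set(ref_counts))

    # Pseudocounts keep empty bins from producing log(0) or division by zero.
    pair_total = len(pair_distances) + pseudocount * len(bins)
    ref_total = len(reference_distances) + pseudocount * len(bins)

    profile = {}
    for b in bins:
        p_pair = (pair_counts[b] + pseudocount) / pair_total
        p_ref = (ref_counts[b] + pseudocount) / ref_total
        profile[b * bin_width] = math.log(p_pair / p_ref)
    return profile


# Made-up separations: the pair type clusters near 3.5-4.0 Å, the way the
# caption describes Cys-Cys, while the reference set is spread out.
pair = [3.6, 3.8, 3.9, 3.7, 8.2]
reference = [3.8, 5.1, 6.3, 8.1, 8.4, 8.8, 7.2, 9.5]
profile = log_odds_profile(pair, reference)
```

With these toy inputs the bin starting at 3.5 Å gets a positive score and the bin starting at 8.0 Å a negative one, mirroring the qualitative shape the caption describes for Cys-Cys.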

Mentions: For comparison, the same analysis was also performed for the distance-dependent 1D scores. The results showed that some of the same residue pairs had the highest accuracy as for the 6D score, namely Cys-Cys as well as all combinations of Ile, Leu, and Val. In addition, however, the same-charge residue pairs Glu-Glu and Asp-Glu were among the most accurately predicted residue pairs. This is likely due to the unfavorable electrostatic energy for close separations, which is reflected in negative scores at close separations for these residue pairs. Figure 1 shows a plot of the 1D residue pair scores for Cys-Cys and Glu-Glu and illustrates these trends as a function of inter-residue separation. The median AUC for the 1D scores was 0.53, which is lower than that for the 6D scores. As will be seen in the next section, the higher accuracy of the individual 6D residue scores translates into higher accuracy for total protein threading scores.
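
For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes an AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, then takes the median over residue pair types. How the paper defines positives and negatives is not given in this excerpt, so the function name `auc`, the labels, and all the numbers below are hypothetical.

```python
from statistics import median


def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    scores higher; ties count as half. A value of 0.5 means no discrimination.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Made-up scores for two residue pair types, keyed by pair name.
scores_by_pair = {
    "Cys-Cys": ([0.9, 0.4], [0.5, 0.1]),
    "Ala-Ala": ([1.0, 1.0], [1.0, 1.0]),
}
auc_by_pair = {
    name: auc(pos, neg) for name, (pos, neg) in scores_by_pair.items()
}
median_auc = median(auc_by_pair.values())
```

On this scale the reported median of 0.53 for the 1D scores sits only slightly above the 0.5 expected from a score with no discriminating power.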

