Sparse estimation for structural variability.

Hosur R, Singh R, Berger B - Algorithms Mol Biol (2011)

Bottom Line: Our results indicate that the algorithm is able to accurately distinguish genuine conformational changes from variability due to noise. In addition to improved performance over existing methods, the algorithm is robust to the levels of noise present in real data. Our algorithm is also general enough to be integrated into state-of-the-art software tools for structure inference.


Affiliation: Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA. bab@mit.edu.

ABSTRACT

Background: Proteins are dynamic molecules that exhibit a wide range of motions; often these conformational changes are important for protein function. Determining biologically relevant conformational changes, or true variability, efficiently is challenging due to the noise present in structure data.

Results: In this paper we present a novel approach to elucidate conformational variability in structures solved using X-ray crystallography. We first infer an ensemble to represent the experimental data and then formulate the identification of truly variable members of the ensemble (as opposed to those that vary only due to noise) as a sparse estimation problem. Our results indicate that the algorithm is able to accurately distinguish genuine conformational changes from variability due to noise. We validate our predictions for structures in the Protein Data Bank by comparing with NMR experiments, as well as on synthetic data. In addition to improved performance over existing methods, the algorithm is robust to the levels of noise present in real data. In the case of Human Ubiquitin-conjugating enzyme Ubc9, variability identified by the algorithm corresponds to functionally important residues implicated by mutagenesis experiments. Our algorithm is also general enough to be integrated into state-of-the-art software tools for structure-inference.
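
As a reading aid, the sparse estimation step mentioned above can be written in the generic Lasso form below. This is only a sketch of the standard formulation, not the paper's exact objective: the symbols y (a vector of experimental observations), X (a matrix whose K columns hold the contributions of the K ensemble members), w (the member weights) and λ (the sparsity penalty) are assumptions introduced here for illustration.

```latex
\hat{w} \;=\; \arg\min_{w \in \mathbb{R}^{K}} \; \tfrac{1}{2}\,\lVert y - Xw \rVert_2^2 \;+\; \lambda\,\lVert w \rVert_1 , \qquad \lambda \ge 0
```

Under this reading, the L1 penalty drives most entries of w to exactly zero, so the members with nonzero weight are the ones the data require, and those that differ only by noise would tend to be dropped.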



Figure 6: Flexibility analysis of the 1a3s ensemble. A) Residue-level Lasso with a window size of 5 reveals four fragments (peaks) of potential interest: 6-15, 30-40, 115-120 and 135-142. B) The N-terminal region (12-20) of 1a3s. Multiple rotamers of R13 (left, red) might affect the interaction surface consisting of R17 (red), K14 and K18 (yellow), thus influencing Ubc9's N-terminus specificity. C) Variability around the catalytic site Cys93 (yellow). Residues Gln126 (brown) and Asp127 (green) have been identified through mutagenesis experiments as critical for Ubc9's interaction with a substrate. The black structure represents the PDB coordinates.

Mentions: We then asked: "Is there any biological insight from the ensemble that can help us understand protein function?" Our results on the crystal structure of the human ubiquitin-conjugating enzyme (Ubc9, PDB ID: 1a3s) provide some interesting anecdotal evidence [26]. Using a window size of 5 centered on each residue, we applied Lasso to identify the most variable regions of 1a3s (11 structures; Figure 6A). Four fragments turn out to be highly variable: the N-terminal helix (6-20), 30-40, 115-120 and the C-terminal residues 135-145. This is in good agreement with NMR experiments, which reveal that Leu6, Ala10, Arg13, Arg17, Leu38, Leu119, Gln126, Asp127, Ala129, Glu132, Ile136 and Asn140 are amongst the most flexible residues in an otherwise rigid structure [27,28]. These residues overlap with our predictions of the truly variable regions (Figure 6A). Our method is thus able to identify physically relevant variability.
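
The windowing and peak-grouping idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy scores and the threshold are all hypothetical, and a simple window average stands in for the per-window Lasso fit, which is not reproduced here.

```python
from typing import List, Sequence, Tuple


def window_bounds(center: int, n_residues: int, size: int = 5) -> Tuple[int, int]:
    """Inclusive 1-based (start, end) of a window centered on `center`,
    truncated at the chain termini."""
    half = size // 2
    return max(1, center - half), min(n_residues, center + half)


def windowed_scores(per_residue: Sequence[float], size: int = 5) -> List[float]:
    """Average a per-residue score over the window centered on each residue.

    Assumption: averaging is only a stand-in for the window-level Lasso score.
    """
    n = len(per_residue)
    out = []
    for center in range(1, n + 1):
        start, end = window_bounds(center, n, size)
        window = per_residue[start - 1:end]
        out.append(sum(window) / len(window))
    return out


def fragments_above(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive residues scoring above `threshold` into
    inclusive 1-based (start, end) fragments."""
    fragments = []
    start = None
    for i, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            fragments.append((start, i - 1))
            start = None
    if start is not None:
        fragments.append((start, len(scores)))
    return fragments


# Toy input (hypothetical): 20 residues, with residues 6-10 marked as variable.
toy_scores = [1.0 if 6 <= r <= 10 else 0.0 for r in range(1, 21)]
smoothed = windowed_scores(toy_scores, size=5)
peaks = fragments_above(smoothed, threshold=0.5)
print(peaks)
```

The window is truncated rather than padded at the termini, so the first and last residues are scored over three positions instead of five; that choice is also an assumption of this sketch.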

