SCMCRYS: predicting protein crystallization using an ensemble scoring card method with estimating propensity scores of P-collocated amino acid pairs.

Charoenkwan P, Shoombuatong W, Lee HC, Chaijaruwanich J, Huang HL, Ho SY - PLoS ONE (2013)

Bottom Line: The experimental results show that the single SCM classifier utilizing dipeptide composition with accuracy of 73.90% is comparable to the best previously-developed SVM-based classifier, SVM_POLY (74.6%), and our proposed SVM-based classifier utilizing the same dipeptide composition (77.55%). The SCMCRYS method with accuracy of 76.1% is comparable to the state-of-the-art ensemble methods PPCpred (76.8%) and RFCRYS (80.0%), which used the SVM and Random Forest classifiers, respectively. The propensity scores of amino acids and dipeptides for estimating the protein crystallizability can aid biologists in designing mutation of surface residues to enhance protein crystallizability.


Affiliation: Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan.

ABSTRACT
Existing methods for predicting protein crystallization achieve high accuracy by combining various types of complementary features with complex ensemble classifiers, such as support vector machine (SVM) and Random Forest classifiers. It is desirable to develop a simple and easily interpretable prediction method with informative sequence features to provide insights into protein crystallization. This study proposes an ensemble method, SCMCRYS, to predict protein crystallization, in which each classifier is built using a scoring card method (SCM) that estimates propensity scores of p-collocated amino acid (AA) pairs (p=0 for a dipeptide). The SCM classifier determines the crystallization of a sequence according to a weighted-sum score. The weights are the compositions of the p-collocated AA pairs, and the propensity scores of these AA pairs are estimated using a statistical approach combined with optimization. SCMCRYS predicts crystallization by simple voting among a number of SCM classifiers. The experimental results show that the single SCM classifier utilizing dipeptide composition, with an accuracy of 73.90%, is comparable to the best previously developed SVM-based classifier, SVM_POLY (74.6%), and to our proposed SVM-based classifier utilizing the same dipeptide composition (77.55%). The SCMCRYS method, with an accuracy of 76.1%, is comparable to the state-of-the-art ensemble methods PPCpred (76.8%) and RFCRYS (80.0%), which used SVM and Random Forest classifiers, respectively. This study also performs a mutagenesis analysis based on SCM; the results suggest the hypothesis that mutating surface residues to Ala and to Cys has a large and a small probability, respectively, of enhancing protein crystallizability, considering the estimated scores of crystallizability and solubility, melting point, molecular weight and conformational entropy of amino acids in a generalized condition.
The propensity scores of amino acids and dipeptides for estimating protein crystallizability can aid biologists in designing mutations of surface residues to enhance protein crystallizability. The source code of SCMCRYS is available at http://iclab.life.nctu.edu.tw/SCMCRYS/.
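The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code: the function names, the decision threshold, the majority-vote rule, and the toy propensity scores are all assumptions made for illustration. The abstract only states that the score is a weighted sum of propensity scores with pair compositions as weights, and that SCMCRYS combines SCM classifiers by simple voting.

```python
from collections import Counter


def pair_composition(seq, p=0):
    """Relative frequency of each p-collocated amino acid pair in seq.

    A p-collocated pair is two residues separated by p other residues,
    so p=0 yields ordinary dipeptides.
    """
    step = p + 1
    pairs = [seq[i] + seq[i + step] for i in range(len(seq) - step)]
    total = len(pairs)
    if total == 0:
        return {}
    return {pair: n / total for pair, n in Counter(pairs).items()}


def scm_score(seq, propensity, p=0):
    """Weighted-sum score: composition of each pair times its propensity score.

    Pairs missing from the propensity table contribute 0 (an assumption).
    """
    comp = pair_composition(seq, p)
    return sum(w * propensity.get(pair, 0.0) for pair, w in comp.items())


def scm_predict(seq, propensity, threshold, p=0):
    """Predict 'crystallizable' when the score exceeds a threshold (assumed rule)."""
    return scm_score(seq, propensity, p) > threshold


def ensemble_vote(seq, classifiers):
    """Majority vote over SCM classifiers given as (p, propensity, threshold)."""
    votes = sum(scm_predict(seq, prop, thr, p) for p, prop, thr in classifiers)
    return votes * 2 > len(classifiers)


if __name__ == "__main__":
    # Toy propensity scores, invented for illustration only.
    toy_scores = {"AA": 600.0, "AK": 300.0, "KA": 450.0}
    seq = "AAKAAKA"
    print(scm_score(seq, toy_scores, p=0))
    print(ensemble_vote(seq, [(0, toy_scores, 400.0),
                              (1, toy_scores, 400.0),
                              (2, toy_scores, 400.0)]))
```

Because the score is linear in the pair compositions, each propensity score can be read directly as the contribution of one AA pair, which is what makes this kind of classifier easy to interpret.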


Figure 4: The three-dimensional structure of Rho GDP-dissociation inhibitor. (a) The predicted structure of the wild-type Rho GDP-dissociation inhibitor and (b) the structure of a mutant Rho GDP-dissociation inhibitor (NDelta66: K135,138,141A; L196F mutant; 1fso).

Mentions: Longenecker et al. [24] reported that Lys→Ala mutations enhanced the crystallization of human RhoGDI mutants compared to the wild type, which was not crystallizable. Figure 4 shows the three-dimensional structure of Rho GDP-dissociation inhibitor: (a) the predicted structure of the wild type, obtained with (PS)2 (Protein Structure Prediction Server) [46], and (b) its NDelta66: K135,138,141A;L196F mutant, 1fso [24]; both images were generated using PyMOL [47]. Mutating large, flexible surface amino acids to smaller residues with no conformational entropy might enhance crystallization. Additionally, Ala ranks higher than Lys in all aspects of propensity scores and biophysical properties. All the results for single and triple mutants support our hypothesis for enhancing proteins' ability to crystallize [24].
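As a rough illustration of how such substitutions could be prepared in silico before rescoring a sequence, the sketch below applies point mutations written in the usual "K135A" style. The helper name and the toy sequence are hypothetical; the sequence is not RhoGDI, and positions are assumed to be 1-based indices into the sequence supplied, which will not hold for a truncated construct numbered against the full-length protein.

```python
import re


def apply_point_mutations(seq, mutations):
    """Return seq with substitutions such as 'K135A' applied (1-based positions).

    Each mutation names the wild-type residue, its position, and the new
    residue; the wild-type residue is checked against the sequence.
    """
    residues = list(seq)
    for mut in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mut)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mut!r}")
        wild, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != wild:
            raise ValueError(
                f"expected {wild} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    # Toy sequence, invented for illustration; two Lys→Ala substitutions.
    print(apply_point_mutations("MKTKLK", ["K2A", "K4A"]))
```

The wild-type check guards against numbering mismatches, which are a common source of error when mutation lists are copied from a paper onto a different construct.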

