CRHunter: integrating multifaceted information to predict catalytic residues in enzymes

View Article: PubMed Central - PubMed

ABSTRACT

A variety of algorithms have been developed for catalytic residue prediction based on either feature- or template-based methodology. However, no studies have systematically compared these two strategies or considered whether combining them could improve prediction performance. Herein, we developed an integrative algorithm named CRHunter that simultaneously exploits the complementarity between feature- and template-based methodologies and that between structural and sequence information. Several novel structural features were generated by the Delaunay triangulation and Laplacian transformation of enzyme structures. Combining these features with traditional descriptors, we built two support vector machine feature predictors that draw on structural and sequence information. Furthermore, we established two template predictors using structure and profile alignments. Evaluated on datasets with different levels of homology, our feature predictors achieve relatively stable performance, whereas our template predictors yield poor results when the homology relationships become weak. Nevertheless, the hybrid algorithm CRHunter consistently achieves the best performance among all our predictors. We also illustrate that our methodology can be applied to the predicted structures of enzymes. Compared with state-of-the-art methods, CRHunter yields comparable or better performance on various datasets. Finally, the application of this algorithm to structural genomics targets sheds light on solved protein structures with unknown functions.



Fig. 1: Schematic representation of the CRHunter algorithm. CRHunter is divided into two partitions, namely StrHunter and SeqHunter, both of which further comprise feature- and template-based predictors.
© Copyright Policy - open-access

As shown in Fig. 1, the proposed system is separated into two partitions, a structure-based and a sequence-based prediction module, each of which comprises a feature-based and a template-based predictor. In the structure-based module, our template method locates potential catalytic residues from the global structural similarity between the query enzyme and its well-aligned reference structures, while our feature method provides complementary signatures by combining machine learning techniques with local structural characterization. Because protein structures have not been solved for all enzymes, we extended the integrative strategy of our structure-based module to sequence-based prediction: effective sequence characteristics were extracted as the inputs of the other feature predictor, and structure alignment was replaced by sequence profile alignment in the template predictor. These four component predictors were integrated to establish our final prediction algorithm.
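The text does not say how the four component predictors are integrated, so the following is only a minimal sketch of one plausible scheme: a weighted mean of per-residue scores that skips any predictor with no output (for example, a template predictor that found no usable template). The predictor keys, the weights, the 0.5 threshold, and the function names are all hypothetical and are not taken from CRHunter.

```python
from typing import Dict, List, Optional

# Hypothetical weights for the four component predictors.
# The source does not state how CRHunter combines them.
WEIGHTS: Dict[str, float] = {
    "str_feature": 0.3,   # structure-based feature predictor
    "str_template": 0.3,  # structure-based template predictor
    "seq_feature": 0.2,   # sequence-based feature predictor
    "seq_template": 0.2,  # sequence-based template predictor
}


def combine_scores(scores: Dict[str, Optional[float]],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted mean over the predictors that returned a score.

    A value of None means the predictor produced no output for this
    residue; its weight is then excluded from the normalisation.
    """
    available = {k: v for k, v in scores.items() if v is not None}
    if not available:
        return 0.0
    total_weight = sum(weights[k] for k in available)
    return sum(weights[k] * v for k, v in available.items()) / total_weight


def predict_catalytic(per_residue: List[Dict[str, Optional[float]]],
                      threshold: float = 0.5) -> List[int]:
    """Return indices of residues whose combined score reaches the threshold."""
    return [i for i, s in enumerate(per_residue)
            if combine_scores(s) >= threshold]


if __name__ == "__main__":
    residues = [
        {"str_feature": 0.9, "str_template": 0.8,
         "seq_feature": 0.7, "seq_template": 0.6},
        # No template hit for this residue: only feature scores are used.
        {"str_feature": 0.2, "str_template": None,
         "seq_feature": 0.1, "seq_template": None},
    ]
    print(predict_catalytic(residues))
```

Renormalising over the available predictors is one way to keep the combined score meaningful when homology is weak and the template predictors contribute nothing; a trained meta-classifier over the four scores would be an equally reasonable alternative.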

