Limits...
Highly Accurate Prediction of Protein-Protein Interactions via Incorporating Evolutionary Information and Physicochemical Characteristics

View Article: PubMed Central - PubMed

ABSTRACT

Protein-protein interactions (PPIs) occur at almost all levels of cell functions and play crucial roles in various cellular processes. Thus, identification of PPIs is critical for deciphering the molecular mechanisms and further providing insight into biological processes. Although a variety of high-throughput experimental techniques have been developed to identify PPIs, existing PPI pairs by experimental approaches only cover a small fraction of the whole PPI networks, and further, those approaches hold inherent disadvantages, such as being time-consuming, expensive, and having high false positive rate. Therefore, it is urgent and imperative to develop automatic in silico approaches to predict PPIs efficiently and accurately. In this article, we propose a novel mixture of physicochemical and evolutionary-based feature extraction method for predicting PPIs using our newly developed discriminative vector machine (DVM) classifier. The improvements of the proposed method mainly consist in introducing an effective feature extraction method that can capture discriminative features from the evolutionary-based information and physicochemical characteristics, and then a powerful and robust DVM classifier is employed. To the best of our knowledge, it is the first time that DVM model is applied to the field of bioinformatics. When applying the proposed method to the Yeast and Helicobacter pylori (H. pylori) datasets, we obtain excellent prediction accuracies of 94.35% and 90.61%, respectively. The computational results indicate that our method is effective and robust for predicting PPIs, and can be taken as a useful supplementary tool to the traditional experimental methods for future proteomics research.

No MeSH data available.


The flow chart of the proposed method.
© Copyright Policy
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC5037676&req=5

ijms-17-01396-f002: The flow chart of the proposed method.

Mentions: In this study, the procedure of the proposed approach mainly consists of two steps: feature extraction and classification prediction. The feature extraction is also divided into three sub steps: (1) the PSI-BLAST tool is used to represent each protein sequence and the corresponding PSSMs are obtained; (2) Based on PSSM and physicochemical characteristics, each probabilistic residue is calculated; (3) Each autocorrelation correlation feature vector is established according to Equation (2). Similarly, classification prediction also includes two sub steps. (1) As described before, each dataset is divided into training set and test set. The training set is used to train the DVM model; (2) the trained DVM model is employed to predict the PPIs on the Yeast and H. pylori datasets and the performance of the algorithm is evaluated. Similarly, the SVM model is also constructed for predicting PPIs on the Yeast dataset. The flow chart of our proposed approach is illustrated as Figure 2.


Highly Accurate Prediction of Protein-Protein Interactions via Incorporating Evolutionary Information and Physicochemical Characteristics
The flow chart of the proposed method.
© Copyright Policy
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC5037676&req=5

ijms-17-01396-f002: The flow chart of the proposed method.
Mentions: In this study, the procedure of the proposed approach mainly consists of two steps: feature extraction and classification prediction. The feature extraction is also divided into three sub steps: (1) the PSI-BLAST tool is used to represent each protein sequence and the corresponding PSSMs are obtained; (2) Based on PSSM and physicochemical characteristics, each probabilistic residue is calculated; (3) Each autocorrelation correlation feature vector is established according to Equation (2). Similarly, classification prediction also includes two sub steps. (1) As described before, each dataset is divided into training set and test set. The training set is used to train the DVM model; (2) the trained DVM model is employed to predict the PPIs on the Yeast and H. pylori datasets and the performance of the algorithm is evaluated. Similarly, the SVM model is also constructed for predicting PPIs on the Yeast dataset. The flow chart of our proposed approach is illustrated as Figure 2.

View Article: PubMed Central - PubMed

ABSTRACT

Protein-protein interactions (PPIs) occur at almost all levels of cell functions and play crucial roles in various cellular processes. Thus, identification of PPIs is critical for deciphering the molecular mechanisms and further providing insight into biological processes. Although a variety of high-throughput experimental techniques have been developed to identify PPIs, existing PPI pairs by experimental approaches only cover a small fraction of the whole PPI networks, and further, those approaches hold inherent disadvantages, such as being time-consuming, expensive, and having high false positive rate. Therefore, it is urgent and imperative to develop automatic in silico approaches to predict PPIs efficiently and accurately. In this article, we propose a novel mixture of physicochemical and evolutionary-based feature extraction method for predicting PPIs using our newly developed discriminative vector machine (DVM) classifier. The improvements of the proposed method mainly consist in introducing an effective feature extraction method that can capture discriminative features from the evolutionary-based information and physicochemical characteristics, and then a powerful and robust DVM classifier is employed. To the best of our knowledge, it is the first time that DVM model is applied to the field of bioinformatics. When applying the proposed method to the Yeast and Helicobacter pylori (H. pylori) datasets, we obtain excellent prediction accuracies of 94.35% and 90.61%, respectively. The computational results indicate that our method is effective and robust for predicting PPIs, and can be taken as a useful supplementary tool to the traditional experimental methods for future proteomics research.

No MeSH data available.