A grammar inference approach for predicting kinase specific phosphorylation sites.

Datta S, Mukhopadhyay S - PLoS ONE (2015)

Bottom Line: Extensive experiments on several datasets generated by us reveal that our inferred grammar successfully predicts phosphorylation sites in a kinase-specific manner. It performs significantly better than the other existing phosphorylation site prediction methods. We have also compared our inferred DSFA with two other GI algorithms.


Affiliation: Department of Biophysics, Molecular Biology and Bioinformatics and Distributed Information Centre for Bioinformatics, University of Calcutta, Kolkata, West Bengal, India.

ABSTRACT
Kinase-mediated phosphorylation is a key post-translational modification that plays an important role in regulating various cellular processes and phenotypes. Many diseases, such as cancer, are related to signaling defects associated with protein phosphorylation. Characterizing the protein kinases and their substrates enhances our ability to understand the mechanism of protein phosphorylation and extends our knowledge of signaling networks, thereby helping us to treat such diseases. Experimental methods for identifying phosphorylation sites are labour intensive and expensive. Also, the manifold increase of protein sequences in the databanks over the years necessitates high-speed and accurate computational methods for predicting phosphorylation sites in protein sequences. To date, a number of computational methods have been proposed by various researchers for predicting phosphorylation sites, but there remains much scope for improvement. In this communication, we present a simple and novel method based on the Grammatical Inference (GI) approach to automate the prediction of kinase-specific phosphorylation sites. To this end, we have used a popular GI algorithm, Alergia, to infer Deterministic Stochastic Finite State Automata (DSFA), which equivalently represent the regular grammar corresponding to the phosphorylation sites. Extensive experiments on several datasets generated by us reveal that our inferred grammar successfully predicts phosphorylation sites in a kinase-specific manner. It performs significantly better than the other existing phosphorylation site prediction methods. We have also compared our inferred DSFA with two other GI algorithms. The DSFA generated by our method performs better, which indicates that our method is robust and has potential for predicting phosphorylation sites in a kinase-specific manner.


pone.0122294.g003: The Deterministic Finite State Automaton (DFA) constructed from the PTA by state merging method.

The PTA algorithm is an unsupervised relational learner that infers grammars from a set of unlabelled samples. It works by first constructing a prefix tree acceptor (PTA), which is itself a DFA with a separate path from the start state to a final accepting state for each string in the positive sample set S+. Each transition of the DFA represents an input token (in our case, an amino acid). The PTA accepts only the strings in S+. In the next step, similar states of the PTA are merged iteratively until a minimum similarity threshold is reached: in each iteration, the similarity between every pair of states is calculated and the most similar pair is merged. The result is the final minimized DFA. Fig 2 shows the diagram of a PTA and Fig 3 shows the final DFA generated over five sample sequences.
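The two steps described above (building the PTA, then merging similar states) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The text does not specify the similarity measure, so the `similarity` function here (Jaccard overlap of outgoing symbols), the `threshold` default, the function names and the sample peptides are all assumptions made for illustration only.

```python
from itertools import combinations

def build_pta(samples):
    """Build a prefix tree acceptor: strings sharing a prefix share states."""
    trans = {}        # (state, symbol) -> next state
    accepting = set()
    n_states = 1      # state 0 is the start state
    for s in samples:
        q = 0
        for ch in s:
            if (q, ch) not in trans:
                trans[(q, ch)] = n_states
                n_states += 1
            q = trans[(q, ch)]
        accepting.add(q)
    return trans, accepting, n_states

def accepts(trans, accepting, s):
    """Run string s through the automaton from start state 0."""
    q = 0
    for ch in s:
        if (q, ch) not in trans:
            return False
        q = trans[(q, ch)]
    return q in accepting

def similarity(trans, accepting, p, q):
    """Hypothetical measure (an assumption, not from the paper):
    Jaccard overlap of outgoing symbols; accepting status must match."""
    if (p in accepting) != (q in accepting):
        return 0.0
    out_p = {sym for (src, sym) in trans if src == p}
    out_q = {sym for (src, sym) in trans if src == q}
    if not out_p and not out_q:
        return 1.0
    return len(out_p & out_q) / len(out_p | out_q)

def merge_states(trans, accepting, a, b):
    """Merge a and b, then fold any successors that would otherwise
    make the automaton non-deterministic."""
    parent = {}
    def find(x):
        while x in parent:
            x = parent[x]
        return x
    pending = [(a, b)]
    while pending:
        x, y = pending.pop()
        x, y = find(x), find(y)
        if x != y:
            parent[max(x, y)] = min(x, y)  # smaller id survives, so 0 stays the start
        merged = {}
        for (src, sym), dst in trans.items():
            key, dst = (find(src), sym), find(dst)
            if key in merged and merged[key] != dst:
                pending.append((merged[key], dst))  # conflict: merge the targets too
            merged.setdefault(key, dst)
    return merged, {find(q) for q in accepting}

def infer_dfa(samples, threshold=0.5):
    """Repeatedly merge the most similar pair until none reaches the threshold."""
    trans, accepting, _ = build_pta(samples)
    while True:
        states = sorted({0} | {s for s, _ in trans} | set(trans.values()))
        best = max(combinations(states, 2),
                   key=lambda pq: similarity(trans, accepting, *pq),
                   default=None)
        if best is None or similarity(trans, accepting, *best) < threshold:
            return trans, accepting
        trans, accepting = merge_states(trans, accepting, *best)

# Hypothetical peptide fragments, not the five sequences used in the paper.
samples = ["RRAS", "RRGS", "KRAS", "RKAS", "RRAT"]
dfa_trans, dfa_accepting = infer_dfa(samples)
```

Because merging only ever identifies states with one another, every string accepted by the PTA is still accepted by the merged DFA. The merged automaton may also accept strings outside S+, which is how the inferred grammar generalizes beyond the training sequences.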

