Prediction of 492 human protein kinase substrate specificities.

Safaei J, Maňuch J, Gupta A, Stacho L, Pelech S - Proteome Sci (2011)

Bottom Line: Complex intracellular signaling networks monitor diverse environmental inputs to evoke appropriate and coordinated effector responses. This represents a marked advancement over existing methods such as those used in NetPhorest (179 kinases in 76 groups) and NetworKIN (123 kinases), which consider only positive determinants for kinase substrate prediction. Furthermore, for many of the better known kinases, the predicted optimal phosphosite sequences were more accurate than the consensus phosphosite sequences inferred by simple alignment of the phosphosites of known kinase substrates.


Affiliation: Department of Computer Science, University of British Columbia, Vancouver, Canada. jsafaei@cs.ubc.ca.

ABSTRACT

Background: Complex intracellular signaling networks monitor diverse environmental inputs to evoke appropriate and coordinated effector responses. Defective signal transduction underlies many pathologies, including cancer, diabetes, autoimmunity and about 400 other human diseases. Therefore, there is high impetus to define the composition and architecture of cellular communications networks in humans. The major components of intracellular signaling networks are protein kinases and protein phosphatases, which catalyze the reversible phosphorylation of proteins. Here, we have focused on identification of kinase-substrate interactions through prediction of the phosphorylation site specificity from knowledge of the primary amino acid sequence of the catalytic domain of each kinase.

Results: The presented method predicts 488 different kinase catalytic domain substrate specificity matrices in 478 typical and 4 atypical human kinases that rely on both positive and negative determinants for scoring individual phosphosites for their suitability as kinase substrates. This represents a marked advancement over existing methods such as those used in NetPhorest (179 kinases in 76 groups) and NetworKIN (123 kinases), which consider only positive determinants for kinase substrate prediction. Comparison of our predicted matrices with experimentally derived matrices from about 9,000 known kinase-phosphosite substrate pairs revealed a high degree of concordance with the established preferences of about 150 well-studied protein kinases. Furthermore, for many of the better known kinases, the predicted optimal phosphosite sequences were more accurate than the consensus phosphosite sequences inferred by simple alignment of the phosphosites of known kinase substrates.

Conclusions: Application of this improved kinase substrate prediction algorithm to the primary structures of over 23,000 proteins encoded by the human genome has permitted the identification of about 650,000 putative phosphosites, which are posted on the open source PhosphoNET website (http://www.phosphonet.ca).



Figure 1: Kinase catalytic domain alignment. Some of the well-characterized protein kinases, with the critical amino acids in their catalytic domains. The rightmost column shows the (–3) position of the consensus sequence of each kinase. Strongly positively charged amino acids (R, K) are shown in blue, weakly positively charged histidine in light blue, strongly negatively charged amino acids (E, D) in red, hydrophobic amino acids (L, V, I, F) in green, and proline (P) in brown.

Mentions: In this section, we present our algorithm for predicting the position-specific scoring matrices (PSSMs) of kinases from their catalytic domains. The idea is that catalytic domains of different kinases with similar SDRs tend to have similar patterns in their phosphosite regions. To quantify the similarity of kinase catalytic domains, we perform a multiple sequence alignment (MSA) of the catalytic domains using the ClustalW algorithm [19]. Because the raw MSA contains many gaps and is not fully accurate, the alignments were manually modified. We perform this alignment on 488 catalytic domains of the typical protein kinases. The length of each kinase catalytic domain after MSA is 247. For 224 domains in the alignment, we compute consensus sequences using 6,515 confirmed kinase–phosphosite pairs. Figure 1 shows portions of the aligned catalytic domains of some of the best-characterized kinases, those for which the most phosphosites have been identified. To generate the consensus sequence of a kinase, its profile matrix is computed from its confirmed phosphosite regions. For each position in the consensus sequence, the amino acid with the maximum probability at that position is selected. If that probability is greater than 15%, a capital letter represents the amino acid; if it is between 8% and 15%, a lowercase letter is used; and if it is less than 8%, the symbol ‘x’ is used at that position. Here ‘x’ is a “don’t care” letter, meaning that any amino acid can appear at that position of the kinase’s phosphosite region. Therefore, kinases with more ‘x’ positions in their consensus sequences are more general and can phosphorylate more sites than the others. Figure 2 presents the consensus sequences of some of the well-studied kinases.
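The consensus rule described above (capital letter above 15%, lowercase between 8% and 15%, ‘x’ below 8%) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `profile_matrix` and `consensus`, the toy input, the use of raw frequencies without pseudocounts, and the handling of probabilities exactly equal to a threshold (treated here as falling into the lower class) are all assumptions made for illustration. Ties for the most probable amino acid are broken by first appearance, which the text does not specify.

```python
from collections import Counter


def profile_matrix(sites):
    """Per-position amino acid frequencies for equal-length phosphosite windows.

    `sites` is a list of strings, one per confirmed phosphosite region
    (hypothetical input format; the source does not specify one).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all phosphosite windows must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        profile.append({aa: n / len(sites) for aa, n in counts.items()})
    return profile


def consensus(profile, strong=0.15, weak=0.08):
    """Turn a profile matrix into a consensus string using the 15% / 8% rule."""
    letters = []
    for freqs in profile:
        # Most probable amino acid at this position (first one wins on ties).
        aa, p = max(freqs.items(), key=lambda item: item[1])
        if p > strong:
            letters.append(aa.upper())   # strong preference
        elif p > weak:
            letters.append(aa.lower())   # weak preference
        else:
            letters.append("x")          # "don't care" position
    return "".join(letters)


# Hand-made profile (only the top entries per position are listed).
toy_profile = [
    {"R": 0.40, "K": 0.20},
    {"L": 0.10, "V": 0.09},
    {"A": 0.06, "G": 0.05},
    {"S": 1.0},
]
print(consensus(toy_profile))  # RlxS
```

Counting the ‘x’ characters in the result, for example with `consensus(toy_profile).count("x")`, gives a rough measure of how general a kinase is in the sense used above.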

