Unmasking determinants of specificity in the human kinome.
Here, we systematically discover several determinants of specificity (DoS) and experimentally validate three of them, named the αC1, αC3, and APE-7 residues. We demonstrate that DoS form sparse networks of non-conserved residues spanning distant regions. Our results reveal a likely role for inter-residue allostery in specificity and an evolutionary decoupling of kinase activity and specificity, which appear to be loaded on independent groups of residues.
Affiliation: Department of Systems Biology, Technical University of Denmark, 2800 Lyngby, Denmark. Electronic address: email@example.com.
- Protein Kinases/chemistry*/metabolism*
- Computational Biology
- Models, Molecular
- Substrate Specificity
- src Homology Domains
License: CC BY
Figure S2. Related to Figure 2.
(A) Alpha determination for kinase KINspect. As explained in Experimental Procedures, a parameter ‘alpha’ (α) needs to be optimized to determine the best trade-off between using only the most similar domains and including more distant domains when predicting new PSSMs. In essence, the procedure described for KINspect in Figure 2 is performed using different alphas, and the alpha leading to the best performance is chosen. As shown here, the best results (lowest prediction error) were obtained with α = 3, so this value was used subsequently. Although, in line with standard nomenclature for genetic algorithms, we have labeled the y axis “Fitness,” it is important to clarify that KINspect evolves by minimizing the error in predictions, therefore “minimizing fitness.” This “Fitness” is measured as the median Frobenius distance between predicted and experimentally determined PSSMs.
(B) KINspect fitness trajectories. When trained on the human kinome, KINspect reaches convergence after approximately 2000–2500 generations. Fitness is measured as the median Frobenius distance between predicted and observed PSSMs. Each color in this plot shows the fitness of the best mask at each generation for one run. The similarity between the trajectories of the 10 independent KINspect evaluation runs confirms that they followed a similar path to convergence.
(C) KINspect convergence, robustness, and performance. To evaluate whether similar results are obtained in the 10 independent KINspect evaluations, the best mask for each run is compared to all the others at each generation, and their dissimilarity is measured as the Frobenius distance between the vectors. By including box plots every 500 generations, we could also assess the evolution of the overall distribution. The graph illustrates the increase in similarity (decrease in dissimilarity) of results as one moves closer to the final point of convergence. From this, one can conclude that independent algorithm deployments tend to converge to the same (or at least a highly similar) solution. One can further appreciate the similarity corresponding to this Frobenius distance by referring to (D), where the scores of two masks at this distance are represented pairwise.
(D) By taking two of the final specificity masks obtained in two independent KINspect evaluations, we could compare the scores of the two masks at the same kinase domain positions. This distribution shows a large degree of agreement between the two final masks (e.g., residues scoring 1 in one mask have a high tendency to score 1 in the other), as well as a strong tendency for most residues to score 0 in both runs.
(E) KINspect coverage. Overview of the predictive performance of KINspect for different human kinase domains. A longer bar indicates higher (better) predictive performance, while a shorter bar indicates lower (worse) predictive performance. For clarity, bars have been colored dark blue, light blue, orange, or red (predictive performance below the 25th percentile, below the median, above the median, or above the 75th percentile, respectively).
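The “Fitness” used in panels (A) and (B), the median Frobenius distance between predicted and experimentally determined PSSMs, can be illustrated with a minimal sketch. This is not the KINspect implementation: the function names `frobenius_distance` and `median_fitness`, and the representation of a PSSM as a list of rows, are assumptions made only for illustration.

```python
import math
from statistics import median
from typing import List, Sequence

# Assumption: a PSSM is stored as a list of rows of numeric scores.
Matrix = Sequence[Sequence[float]]


def frobenius_distance(a: Matrix, b: Matrix) -> float:
    """Square root of the sum of squared entry-wise differences between a and b."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


def median_fitness(predicted: List[Matrix], observed: List[Matrix]) -> float:
    """Median Frobenius distance over paired predicted/observed matrices.

    Lower values mean smaller prediction error, i.e. a better mask.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("need equally sized, non-empty lists of matrices")
    return median(frobenius_distance(p, o) for p, o in zip(predicted, observed))
```

Because the quantity is an error, a lower value is better, which is why the caption describes KINspect as “minimizing fitness.”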
Since our method contains stochastic aspects (such as the starting set of random masks and the generation of new masks by mutation and cross-over), one initial question is whether the method is robust to this stochasticity, i.e., whether one would obtain similar results if the process were started from arbitrary initial conditions and evaluated independently several times. To this end, we compared the fitness evolution of ten independent KINspect evaluations and found highly comparable fitness trajectories, as well as increasing similarity between the best-performing masks at each generation (Figure S2; Data S1, S2, and S3). Moreover, we confirmed that the results are not simply due to trivial technical factors, such as residue conservation or alignment gaps (Figure S3), and that similar results could not be obtained using uniform or randomized sets (Figure S3). Taken together, these results demonstrate that KINspect is robust to arbitrary initial conditions and converges to a limited set of highly similar solutions (specificity masks, Figure S3).
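The robustness check described above, comparing the best mask of each independent run against all the others at each generation, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: masks are assumed to be equal-length vectors of per-position scores, the names `pairwise_mask_distances` and `dissimilarity_by_generation` are hypothetical, and the median is used here as one possible summary of the pairwise distribution (the figure itself shows box plots).

```python
import math
from itertools import combinations
from statistics import median
from typing import List, Sequence

# Assumption: a mask is a vector with one score per kinase domain position.
Mask = Sequence[float]


def pairwise_mask_distances(masks: Sequence[Mask]) -> List[float]:
    """Distance between every pair of masks.

    For vectors, the Frobenius distance equals the Euclidean distance,
    which math.dist computes directly.
    """
    if len(masks) < 2:
        raise ValueError("need at least two masks to compare")
    return [math.dist(a, b) for a, b in combinations(masks, 2)]


def dissimilarity_by_generation(best_masks: Sequence[Sequence[Mask]]) -> List[float]:
    """Median pairwise distance between runs at each generation.

    best_masks[r][g] is the best mask of run r at generation g.
    Only generations present in every run are compared.
    """
    n_generations = min(len(run) for run in best_masks)
    return [
        median(pairwise_mask_distances([run[g] for run in best_masks]))
        for g in range(n_generations)
    ]
```

A sequence of values that decreases over generations would correspond to the increasing similarity between independent runs reported in Figure S2.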