Combinatorial clustering of residue position subsets predicts inhibitor affinity across the human kinome.

Bryant DH, Moll M, Finn PW, Kavraki LE - PLoS Comput. Biol. (2013)

Bottom Line: The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.


Affiliation: Department of Computer Science, Rice University, Houston, Texas, United States of America.

ABSTRACT
The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, CCORPS is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while having overall poor predictive ability for only 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.


Figure 2 (pcbi-1003087-g002): Decision boundary for label vote vectors computed by SVM. In the scatter plot, each point gives the number of true and false votes accumulated by one substructure across all clusterings. Training an SVM (with a linear kernel) on these label vote vectors together with the known labels for the substructures results in the decision boundary shown as the bold black line. The red and blue regions (right and left sides of the boundary, respectively) denote the values for which the predicted label will be false and true, respectively. Blue points indicate substructures known to have the true label, while red points denote the false label. For Roscovitine, shown here, the two classes are widely separated.


Mentions: Because CCORPS is a semi-supervised approach, the labels for the training structures are known and can be used to empirically estimate a vote count decision boundary. For example, given a structure with a known label, the number of times that structure appeared in a false HPC or a true HPC, across all residue position subsets, can be calculated using the same approach as for unlabeled structures. The structure is then represented by a vote vector with one dimension per label, where each dimension holds the number of votes received for that label (two dimensions for the case of kinase binding affinity, since we only have false and true labels). Applying this procedure to all labeled structures in the dataset provides an empirical basis for calculating a decision boundary in the vote space. For example, the blue and red points shown in the scatter plot of Fig. 2 denote the vote vectors for training set substructures with known true and false labels, respectively.
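
The vote-vector construction and the linear decision boundary described above can be illustrated with a small sketch. It is not the authors' implementation: the names `vote_vector`, `hpc_hits`, `train_linear_svm`, and `predict` are hypothetical, the toy data are invented, and the trainer is a plain subgradient-descent stand-in for a linear soft-margin SVM rather than the SVM package used in the paper.

```python
from collections import Counter

LABELS = (False, True)  # two labels in the kinase binding-affinity case


def vote_vector(structure_id, hpc_hits):
    """Count, per label, how often the structure appeared in an HPC.

    hpc_hits is a hypothetical list of (structure_id, hpc_label) pairs,
    one pair per appearance, pooled over all residue position subsets.
    """
    counts = Counter(label for sid, label in hpc_hits if sid == structure_id)
    return [counts[label] for label in LABELS]  # [false votes, true votes]


def train_linear_svm(vectors, labels, lam=0.01, epochs=50, lr=0.01):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(vectors[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(vectors, labels):
            y = 1.0 if label else -1.0
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term shrinks w on every step.
            w = [wi - lr * lam * wi for wi in w]
            if margin < 1.0:
                # Point is misclassified or inside the margin: move the
                # boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return True when x falls on the 'true' side of the boundary."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0.0


# Toy data (invented for illustration): each pair records one appearance
# of a labeled structure in an HPC carrying the given label.
hpc_hits = (
    [("A", True)] * 9 + [("A", False)] * 1
    + [("B", True)] * 8 + [("B", False)] * 2
    + [("C", True)] * 1 + [("C", False)] * 9
    + [("D", True)] * 3 + [("D", False)] * 8
)
known_labels = {"A": True, "B": True, "C": False, "D": False}

vectors = [vote_vector(s, hpc_hits) for s in known_labels]
w, b = train_linear_svm(vectors, list(known_labels.values()))

# An unlabeled structure is classified from its own vote vector.
print(predict(w, b, [0, 10]), predict(w, b, [10, 0]))
```

With widely separated classes, as in the Roscovitine case, almost any linear boundary through the gap classifies the training points correctly; the choice of trainer matters more when the vote distributions overlap.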

