Combinatorial clustering of residue position subsets predicts inhibitor affinity across the human kinome.

Bryant DH, Moll M, Finn PW, Kavraki LE - PLoS Comput. Biol. (2013)

Bottom Line: The Combinatorial Clustering Of Residue Position Subsets (ccorps) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Additionally, ccorps is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.


Affiliation: Department of Computer Science, Rice University, Houston, Texas, United States of America.

ABSTRACT
The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (ccorps) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, ccorps is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. ccorps is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while having overall poor predictive ability for only 1 of the 38 compounds. Additionally, ccorps is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.



pcbi-1003087-g007: Distribution of phylogenetic and affinity purity cluster scores for VX-680. Each point in the scatter plot marks a cluster's purity in the true drug affinity label along one axis and its phylogenetic label purity along the other. For example, a point with an affinity purity of 1.0 and a phylogenetic purity of 0.2 denotes a cluster that is 100% pure in the true drug affinity label (for VX-680 in this case) but is only 20% pure in the most common phylogenetic label present; that is, this cluster indicates one instance of structural similarity among phylogenetically diverse proteins that also coincides with having affinity for VX-680. Conversely, a point with an affinity purity of 0.5 and a phylogenetic purity of 1.0 indicates a cluster that contains only structures from one phylogenetic (family-level) branch but contains an equal proportion of true and false affinity labels; that is, a case where structurally similar, phylogenetically closely related structures have different affinities for VX-680. Each point is semi-transparent so that darker areas in the plot indicate a higher density of points.

Mentions: Each individual cluster, across all 2925 clusterings and all 38 inhibitors, was evaluated to calculate the purity of both affinity labels and family-level phylogenetic labels. For example, a cluster containing 3 distinct kinase sequences with family labels {AGC, CAMK, TK} would have a phylogenetic purity of 0.33, since the most common family accounts for only one of the three sequences; the affinity purity is the corresponding fraction of sequences carrying the true affinity label. By plotting the affinity and phylogenetic purity scores of each cluster (separately for each inhibitor) as shown in Fig. 6, the distribution of clusters across the spectrum of possible scores can be evaluated. Note that only the clusters having a true label majority, as measured by purity in the true label, are plotted in Fig. 7. Additionally, Table 2 lists per-inhibitor statistics for the cluster distributions shown in Fig. 7.
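The purity calculation described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `label_purity` is hypothetical, and the affinity labels in the example are assumed values chosen for illustration, since the excerpt only specifies the family labels. Phylogenetic purity is taken as the fraction of the most common family label, and affinity purity as the fraction of true affinity labels, following the figure caption.

```python
from collections import Counter


def label_purity(labels, target=None):
    """Return the fraction of `labels` equal to `target`.

    If `target` is None, the most common label in the cluster is used,
    which matches the "most common phylogenetic label" reading of purity.
    """
    if not labels:
        raise ValueError("a cluster must contain at least one label")
    counts = Counter(labels)
    if target is None:
        # Count of the single most frequent label in the cluster.
        return counts.most_common(1)[0][1] / len(labels)
    return counts[target] / len(labels)


# Family labels from the example in the text.
families = ["AGC", "CAMK", "TK"]
# Assumed affinity labels for illustration only (not taken from the source).
affinity = [True, True, False]

phylo_purity = label_purity(families)                  # 1 of 3 sequences
affinity_purity = label_purity(affinity, target=True)  # 2 of 3 sequences

print(round(phylo_purity, 2), round(affinity_purity, 2))  # prints: 0.33 0.67
```

Under these assumed labels, the cluster would sit at an affinity purity of about 0.67 and a phylogenetic purity of about 0.33 in a plot such as the one described for VX-680.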

