PubChem structure-activity relationship (SAR) clusters.

Kim S, Han L, Yu B, Hähnke VD, Bolton EE, Bryant SH - J Cheminform (2015)

Bottom Line: The resulting 18 million clusters, named "PubChem SAR clusters", were delivered in such a way that each cluster contains a group of small molecules similar to each other in both structure and bioactivity. Each SAR cluster can be a useful resource in developing a meaningful SAR or enable one to design or expand compound libraries from the cluster. It can also help to predict the potential therapeutic effects and pharmacological actions of less-known compounds from those of well-known compounds (i.e., drugs) in the same cluster.

Affiliation: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894 USA.

ABSTRACT

Background: Developing structure-activity relationships (SARs) of molecules is an important approach in facilitating hit exploration in the early stage of drug discovery. Although information on millions of compounds and their bioactivities is freely available to the public, it is very challenging to infer a meaningful and novel SAR from that information.

Results: Research discussed in the present paper employed a bioactivity-centered clustering approach to group 843,845 non-inactive compounds stored in PubChem according to both structural similarity and bioactivity similarity, with the aim of mining bioactivity data in PubChem for useful SAR information. The compounds were clustered in three bioactivity similarity contexts: (1) non-inactive in a given bioassay, (2) non-inactive against a given protein, and (3) non-inactive against proteins involved in a given pathway. In each context, these small molecules were clustered according to their two-dimensional (2-D) and three-dimensional (3-D) structural similarities. The resulting 18 million clusters, named "PubChem SAR clusters", were delivered in such a way that each cluster contains a group of small molecules similar to each other in both structure and bioactivity.

Conclusions: The PubChem SAR clusters, pre-computed using publicly available bioactivity information, make it possible to quickly navigate and narrow down the compounds of interest. Each SAR cluster can be a useful resource in developing a meaningful SAR or enable one to design or expand compound libraries from the cluster. It can also help to predict the potential therapeutic effects and pharmacological actions of less-known compounds from those of well-known compounds (i.e., drugs) in the same cluster.
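
The two-level grouping described in the Results (first by bioactivity context, then by structural similarity) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the clustering algorithm, the similarity measure, or any threshold, so the Tanimoto coefficient on fingerprint bit sets, the single-linkage grouping, the 0.8 cutoff, and all compound and assay identifiers below are hypothetical stand-ins. The 3-D similarity step is omitted.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two sets of fingerprint 'on' bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_similarity(fingerprints: dict, threshold: float) -> list:
    """Single-linkage grouping (an assumed stand-in, not the paper's method):
    link every pair whose similarity >= threshold and return the
    connected components as a list of sets."""
    parent = {cid: cid for cid in fingerprints}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for cid in fingerprints:
        groups.setdefault(find(cid), set()).add(cid)
    # Sort for a deterministic order of clusters.
    return sorted(groups.values(), key=lambda g: sorted(g))


def sar_clusters(non_inactive: dict, fingerprints: dict, threshold: float = 0.8) -> dict:
    """For each bioactivity context (e.g. one assay), cluster only the
    compounds that are non-inactive in that context."""
    return {
        context: cluster_by_similarity(
            {cid: fingerprints[cid] for cid in cids}, threshold
        )
        for context, cids in non_inactive.items()
    }


# Hypothetical toy data: fingerprint bit sets and per-assay non-inactive compounds.
toy_fingerprints = {
    "c1": frozenset({1, 2, 3, 4, 5}),
    "c2": frozenset({1, 2, 3, 4, 5, 6}),
    "c3": frozenset({10, 11, 12}),
    "c4": frozenset({10, 11, 12, 13}),
}
toy_non_inactive = {"assay_A": ["c1", "c2", "c3"], "assay_B": ["c3", "c4"]}

toy_result = sar_clusters(toy_non_inactive, toy_fingerprints, threshold=0.8)
```

Because clustering is repeated inside every context, the same compound (here `c3`) can appear in clusters of several assays, which is one way a set of 843,845 compounds can give rise to millions of clusters.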


Figure 6: The number of the PubChem SAR clusters with high-value compounds (HVCs). The HVCs have high potencies (blue), MeSH annotations (red), or “Pharmacological Action” annotations (green). Panels a, b, and c are for assay-, protein-, and pathway-centric clusters. Numbers in parentheses indicate the percentages relative to the respective total cluster counts.

Mentions: Figure 6 shows the number of clusters with HVCs for the assay-, protein-, and pathway-centric clusters. Among the 9.9 million assay-centric clusters, 43.0% (4.3 million) contained HVCs. The fractions of clusters containing HVCs among the protein- and pathway-centric clusters were 49.5% (1.2 million of 2.5 million clusters) and 50.9% (3.1 million of 6.2 million clusters), respectively. The clusters that have high-potency HVCs (with IC50 or EC50 values smaller than 10 μM) correspond to 28.1, 40.1, and 33.8% of the total for the assay-, protein-, and pathway-centric clusters, respectively. The clusters that have MeSH-annotation HVCs account for 20.0, 20.1, and 25.7% of the total for the assay-, protein-, and pathway-centric clusters, respectively. Figure 7 depicts the distribution of the number of HVCs per cluster, and the summary statistics are listed in Table 4. Some clusters have hundreds of HVCs, but most have only a few. On average, for example, the assay-centric clusters have 1.3 HVCs with high potency, 0.5 HVCs with MeSH annotations, and 0.3 HVCs with Pharmacological Action annotations.
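
The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded totals and percentages given in the text (the variable and function names are illustrative), so the recomputed values should be expected to agree with the quoted counts only to within about 0.1 million.

```python
# Totals (in millions of clusters) and fractions of clusters containing HVCs,
# copied from the rounded figures quoted in the text.
totals_millions = {"assay": 9.9, "protein": 2.5, "pathway": 6.2}
hvc_fraction = {"assay": 0.430, "protein": 0.495, "pathway": 0.509}


def clusters_with_hvcs(kind: str) -> float:
    """Estimated number of clusters (in millions) that contain HVCs."""
    return totals_millions[kind] * hvc_fraction[kind]


for kind, total in totals_millions.items():
    estimate = clusters_with_hvcs(kind)
    print(f"{kind}: about {estimate:.2f} million of {total} million clusters")

# Sum of the three cluster types, for comparison with the
# "18 million clusters" stated in the abstract.
total_millions = sum(totals_millions.values())
print(f"all contexts: about {total_millions:.1f} million clusters")
```

The same pattern extends to the high-potency and MeSH-annotation percentages if per-category counts are needed.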
