Improving structural similarity based virtual screening using background knowledge.

Girschick T, Puchbauer L, Kramer S - J Cheminform (2013)

Bottom Line: The inclusion of binding-relevant background knowledge into a structural similarity measure improves the quality of the similarity rankings. Our study shows that adding binding-relevant background knowledge can lead to significantly improved similarity rankings in virtual screening, and that even basic data mining approaches can lead to competitive results, making hand-selection of the background knowledge less crucial. This is especially important in drug discovery and development projects where no receptor structure is available or, more frequently, no verified binding mode is known, so that mostly ligand-based approaches can be applied to generate hit compounds.


Affiliation: Johannes Gutenberg-Universität Mainz, Institut für Informatik, Staudingerweg 9, 55128 Mainz, Germany. kramer@informatik.uni-mainz.de.

ABSTRACT

Background: Virtual screening in the form of similarity rankings is often applied in the early drug discovery process to rank and prioritize compounds from a database. This similarity ranking can be achieved with structural similarity measures. However, their general nature can lead to insufficient performance in some application cases. In this paper, we provide a link between ranking-based virtual screening and fragment-based data mining methods. The inclusion of binding-relevant background knowledge into a structural similarity measure improves the quality of the similarity rankings. This background knowledge, in the form of binding-relevant substructures, can be derived either by hand selection or by automated fragment-based data mining methods.

Results: In virtual screening experiments, we show that both variants of our approach clearly improve enrichment factors: extending the structural similarity measure with background knowledge in the form of a hand-selected relevant substructure, and extending it with background knowledge derived with data mining methods.

Conclusion: Our study shows that adding binding-relevant background knowledge can lead to significantly improved similarity rankings in virtual screening, and that even basic data mining approaches can lead to competitive results, making hand-selection of the background knowledge less crucial. This is especially important in drug discovery and development projects where no receptor structure is available or, more frequently, no verified binding mode is known, so that mostly ligand-based approaches can be applied to generate hit compounds.
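
The enrichment factors mentioned in the Results can be illustrated with a minimal sketch. It assumes the commonly used definition (the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole database); the abstract does not state which variant or cutoff the authors used, and the function name `enrichment_factor` and its parameters are hypothetical.

```python
def enrichment_factor(ranked_labels, fraction):
    """Enrichment factor for the top `fraction` of a similarity ranking.

    ranked_labels: list of booleans, best-ranked compound first;
                   True marks an active (e.g. a known ligand), False a decoy.
    fraction:      share of the ranking to inspect, e.g. 0.01 for the top 1%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty ranking with at least one active")

    # Number of compounds inspected at this cutoff (at least one).
    n_top = max(1, round(n_total * fraction))
    hits_top = sum(ranked_labels[:n_top])

    # Active rate in the top slice relative to the active rate overall.
    return (hits_top / n_top) / (n_actives / n_total)


# Toy ranking (assumed data): 100 compounds, 5 actives, 2 of them in the top 10.
toy_ranking = [True, False, True] + [False] * 7 + [True] * 3 + [False] * 87
ef_10 = enrichment_factor(toy_ranking, 0.10)
```

A value above 1 means the ranking places actives near the top more often than a random ordering would.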




Figure 13: ZINC4628438. 2D structure depiction of ZINC4628438 from the MCS similarity ranking. Rank difference: ΔRank=11.

Mentions: In the first set of experiments, we extract the binding-relevant knowledge used to extend the structural similarity measures from a literature review, supported by MCS similarity ranking and docking calculations. We first rank the screening database (including decoys and statin ligands) with respect to fluvastatin using simMCS, and then dock the top 25 compounds of the similarity ranking to the HMGR receptor. The docking results in Table 4 (long version in Additional file 1: Table S1) show that only one compound (CID 60823) has a good docking score. This is atorvastatin, one of the two statins found in the top 25 of the MCS similarity ranking. All other compounds have rather weak docking scores. Four structures from this ranking are shown in Figures 10, 11, 12 and 13, and the docking of the best non-statin is shown in Figure 8B. The highlighted part of the fluvastatin structure (Figure 3 and Figure 8A), or anything structurally similar, is absent from all of the non-statin structures. According to Istvan et al. [17], this part mimics the original binding ligand and is consequently essential for binding. The hydrophobic part of the statins is responsible for their nanomolar affinity but is not sufficient for inhibitory binding on its own. Taking these facts into consideration, we decided to use the highlighted hydrophilic part of fluvastatin as background knowledge in our study. As described in the Materials and methods section, the substructure was fragmented and used to derive a binary occurrence fingerprint of length 57 for the extended similarity measure (1).
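
To make the idea of a binary occurrence fingerprint concrete, here is a minimal sketch under stated assumptions. The excerpt does not reproduce the extended similarity measure (1), so the weighted combination below is only an illustrative stand-in, not the authors' formula. Molecules are represented as plain sets of fragment strings rather than real chemical structures, and all names (`occurrence_fingerprint`, `tanimoto`, `extended_similarity`, `weight`) and fragment labels are hypothetical.

```python
def occurrence_fingerprint(molecule_fragments, background_fragments):
    """One bit per background fragment: 1 if the molecule contains it, else 0."""
    return [1 if frag in molecule_fragments else 0 for frag in background_fragments]


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two equal-length binary fingerprints."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0


def extended_similarity(base_sim, fp_query, fp_candidate, weight=0.5):
    """ASSUMPTION: a simple weighted mean of the structural similarity and the
    background-knowledge similarity; the paper's measure (1) may differ."""
    return (1 - weight) * base_sim + weight * tanimoto(fp_query, fp_candidate)


# Hypothetical fragment labels standing in for the fragmented substructure
# (the paper derives 57 such fragments; three are enough for illustration).
background = ["frag_acid", "frag_diol", "frag_chain"]

query_fp = occurrence_fingerprint({"frag_acid", "frag_diol", "frag_chain"}, background)
candidate_fp = occurrence_fingerprint({"frag_acid", "frag_ring"}, background)

# A candidate with high structural similarity but little of the
# binding-relevant substructure is pulled down in the combined score.
score = extended_similarity(0.8, query_fp, candidate_fp)
```

In this toy case the candidate shares only one of three background fragments with the query, so its combined score falls below its structural similarity of 0.8.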

