Similarity maps - a visualization strategy for molecular fingerprints and machine-learning methods.

Riniker S, Landrum GA - J Cheminform (2013)

Bottom Line: Similarity is an appealing approach because, with many fingerprint types, it provides intuitive results: a chemist looking at two molecules can understand why they have been determined to be similar. Here we present similarity maps, a straightforward and general strategy to visualize the atomic contributions to the similarity between two molecules or the predicted probability of a ML model. We show the application of similarity maps to a set of dopamine D3 receptor ligands using atom-pair and circular fingerprints as well as two popular ML methods: random forests and naïve Bayes.


Affiliation: Novartis Institutes for BioMedical Research, Basel, Switzerland. gregory.landrum@novartis.com.

ABSTRACT
Fingerprint similarity is a common method for comparing chemical structures. Similarity is an appealing approach because, with many fingerprint types, it provides intuitive results: a chemist looking at two molecules can understand why they have been determined to be similar. This transparency is partially lost with the fuzzier similarity methods that are often used for scaffold hopping and tends to vanish completely when molecular fingerprints are used as inputs to machine-learning (ML) models. Here we present similarity maps, a straightforward and general strategy to visualize the atomic contributions to the similarity between two molecules or the predicted probability of a ML model. We show the application of similarity maps to a set of dopamine D3 receptor ligands using atom-pair and circular fingerprints as well as two popular ML methods: random forests and naïve Bayes. An open-source implementation of the method is provided.


Figure 2: Similarity maps for atom-pairs (AP) fingerprint. Similarity map of molecule 2 (middle) and molecule 3 (right) using AP. The reference compound is molecule 1 (left). Color scheme: removing bits decreases similarity (i.e. positive difference) (green), no change in similarity (gray), removing bits increases similarity (i.e. negative difference) (pink). The default maximum path length of 30 was used for AP.
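
The color scheme in the caption can be read as a simple rule on the sign of each atomic weight. The sketch below is only an illustration of that rule: the function names, the tolerance, and the scaling by the largest absolute weight are assumptions made here, not details given in the caption.

```python
def normalize_weights(weights):
    """Scale weights so the largest absolute value is 1.0.

    Assumption: scaling by the maximum absolute weight is one plausible
    way to put maps of different molecules on a common color scale.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    return [w / max_abs for w in weights]


def weight_to_color(weight, tolerance=1e-9):
    """Map an atomic weight to the color named in the caption.

    weight = similarity(full fingerprint) - similarity(atom's bits removed)
    positive: removing the bits decreases similarity -> green
    zero:     no change in similarity                -> gray
    negative: removing the bits increases similarity -> pink
    The tolerance is a hypothetical cutoff for treating a weight as zero.
    """
    if weight > tolerance:
        return "green"
    if weight < -tolerance:
        return "pink"
    return "gray"


# Hypothetical weights for a four-atom molecule.
example_weights = [0.5, -0.25, 0.0, 0.1]
example_colors = [weight_to_color(w) for w in normalize_weights(example_weights)]
print(example_colors)
```

A real rendering would interpolate color intensity continuously from the scaled weight rather than use three discrete classes; the discrete version only makes the sign convention explicit.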

Mentions: The similarity maps of molecules 2 and 3 using the AP fingerprint are shown in Figure 2. In the AP fingerprint, each atom sees every other atom that lies within the maximum path length of 30 bonds. Atoms with green weights have a majority of paths that are also present in the reference compound; deleting their bits from the fingerprint reduces the similarity to the reference compound. The similarity maps in Figure 2 are consistent with our expectations. For molecule 2, atoms in the phenyl rings, the piperazine moiety and the alkyl linker were found to be important for similarity, whereas removing the bits of the nitrogens in the quinoxaline moiety, the oxygen in the benzofuran moiety, or the amide increased the similarity. For molecule 3 as well, atoms in the alkyl linker and, in part, the piperazine moiety were found to be most important for similarity.
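
The weight calculation described above (remove the bits of one atom, recompute the similarity, take the difference) can be sketched in plain Python. This is a toy illustration, not the authors' implementation: the atom typing (element plus neighbor count), the count-based Dice similarity, the function names, and the two four-atom example molecules are all assumptions made for this sketch.

```python
from collections import Counter, deque


def bond_distances(start, adjacency):
    """Breadth-first search: number of bonds from `start` to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def atom_pair_fp(atoms, bonds, max_path=30, skip_atom=None):
    """Count-based atom-pair fingerprint of a heavy-atom graph.

    Each feature is (type_a, type_b, bond distance). The atom type used
    here (element, neighbor count) is a simplification chosen for this
    sketch. Pairs involving `skip_atom` are left out, which mimics
    removing that atom's bits while keeping the molecule intact.
    """
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    types = [(atoms[i], len(adjacency[i])) for i in range(len(atoms))]
    fp = Counter()
    for i in range(len(atoms)):
        for j, d in bond_distances(i, adjacency).items():
            if j <= i or d > max_path:
                continue  # count each pair once, within the path limit
            if skip_atom in (i, j):
                continue  # drop the features of the removed atom
            fp[tuple(sorted((types[i], types[j]))) + (d,)] += 1
    return fp


def dice(fp_a, fp_b):
    """Dice similarity on feature counts (assumed metric for this sketch)."""
    total = sum(fp_a.values()) + sum(fp_b.values())
    if total == 0:
        return 0.0
    return 2.0 * sum((fp_a & fp_b).values()) / total


def atomic_weights(reference, probe, max_path=30):
    """Weight of probe atom i = base similarity - similarity without atom i."""
    ref_fp = atom_pair_fp(*reference, max_path=max_path)
    base = dice(ref_fp, atom_pair_fp(*probe, max_path=max_path))
    return [
        base - dice(ref_fp, atom_pair_fp(*probe, max_path=max_path, skip_atom=i))
        for i in range(len(probe[0]))
    ]


# Hypothetical toy molecules (heavy atoms only): C-C-C-O and C-C-C-N.
reference = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
probe = (["C", "C", "C", "N"], [(0, 1), (1, 2), (2, 3)])
weights = atomic_weights(reference, probe)
for index, weight in enumerate(weights):
    print(index, probe[0][index], round(weight, 3))
```

In this toy case the three carbons, whose pairs largely also occur in the reference, receive positive weights, while the nitrogen receives a negative weight because removing its pairs raises the similarity. This mirrors the green and pink atoms described for Figure 2.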

