Machine Learning Models and Pathway Genome Data Base for Trypanosoma cruzi Drug Discovery.

Ekins S, de Siqueira-Neto JL, McCall LI, Sarker M, Yadav M, Ponder EL, Kallel EA, Kellar D, Chen S, Arkin M, Bunin BA, McKerrow JH, Talcott C - PLoS Negl Trop Dis (2015)

Bottom Line: Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 μM. We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.


Affiliation: Collaborative Drug Discovery, Burlingame, California, United States of America; Collaborations in Chemistry, Fuquay-Varina, North Carolina, United States of America.

ABSTRACT

Background: Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity.

Methodology/principal findings: In the present study we developed a computational approach that utilized data from several public whole-cell, phenotypic high throughput screens that have been completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We have also compiled and curated relevant biological and chemical compound screening data including (i) compounds and biological activity data from the literature, (ii) high throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help us identify compounds and their potential targets. We have constructed a Pathway Genome Data Base for T. cruzi. In addition, we have developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 μM. We progressed five compounds to an in vivo mouse efficacy model of Chagas disease and validated that the machine learning model could identify in vitro active compounds not in the training set, as well as known positive controls. The antimalarial pyronaridine possessed 85.2% efficacy in the acute Chagas mouse model. We have also proposed potential targets (for future verification) for this compound based on structural similarity to known compounds with targets in T. cruzi.

Conclusions/significance: We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.
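
The virtual-screening step in the abstract (training a Bayesian model on screening data, then ranking a compound library) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain Bernoulli naive Bayes classifier with add-one smoothing over binary structural fingerprints, and every name in it (`train_bernoulli_nb`, `log_odds_active`, `rank_library`, the toy fingerprints and compound labels) is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def train_bernoulli_nb(
    fingerprints: Sequence[Sequence[int]], labels: Sequence[int]
) -> Dict:
    """Fit a Bernoulli naive Bayes model on binary fingerprints.

    labels: 1 = active (for example, EC50 below a chosen cutoff), 0 = inactive.
    Add-one (Laplace) smoothing keeps every probability strictly inside (0, 1).
    """
    n_bits = len(fingerprints[0])
    model = {"n_bits": n_bits, "log_prior": {}, "p_on": {}}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Smoothed class prior.
        model["log_prior"][cls] = math.log((len(rows) + 1) / (len(labels) + 2))
        # Smoothed probability that each fingerprint bit is set within the class.
        model["p_on"][cls] = [
            (sum(fp[j] for fp in rows) + 1) / (len(rows) + 2)
            for j in range(n_bits)
        ]
    return model


def log_odds_active(model: Dict, fp: Sequence[int]) -> float:
    """Return log P(active | fp) - log P(inactive | fp) under the model."""
    score = model["log_prior"][1] - model["log_prior"][0]
    for j, bit in enumerate(fp):
        p1 = model["p_on"][1][j]
        p0 = model["p_on"][0][j]
        if bit:
            score += math.log(p1 / p0)
        else:
            score += math.log((1 - p1) / (1 - p0))
    return score


def rank_library(
    model: Dict, library: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Score every library compound and sort from most to least likely active."""
    scored = [(name, log_odds_active(model, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data only: bit 0 is set in both "actives" and in neither "inactive".
train_fps = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
train_labels = [1, 1, 0, 0]
model = train_bernoulli_nb(train_fps, train_labels)
ranked = rank_library(model, {"cmpd_A": [1, 0, 0], "cmpd_B": [0, 0, 0]})
```

In a real workflow the top-ranked compounds would be the candidates sent for in vitro testing; the fingerprint type and activity cutoff are modelling choices that this sketch leaves open.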


Fig 3 (pntd.0003878.g003): An example showing the CDD Vault for this collaboration, illustrating how the structures and biology data can be securely shared.

Mentions: This study made wide use of public datasets in CDD as well as the collaborative sharing of data in the CDD Vault. We have also highlighted how in vivo data from the transgenic T. cruzi Brazil luc strain expressing firefly luciferase can be stored in the software (Fig 3). These data will ultimately be made publicly accessible in this format alongside the datasets we have already made public. In the course of this study we curated T. cruzi data, constructed a Pathway Genome Data Base for T. cruzi (Fig 1), developed multiple Bayesian machine learning models, tested molecules in vitro and in vivo, and proposed potential targets for one of the in vivo active compounds. In the process we identified pyronaridine as having promising in vivo activity in the mouse model of Chagas disease. Future studies will evaluate efficacy in longer-term models and identify the target or targets of this molecule. The approaches taken are broadly applicable to other NTDs and extend our prior work with Mtb [42,43,46,47,56–63]. Leveraging published data to create additional resources and models, for re-mining either known or new datasets to suggest compounds that can be rapidly progressed all the way through to in vivo animal models, may lead to new clinical studies on a shorter time scale. There are many steps we could take to update our computational models, such as incorporating the current data and using other machine learning algorithms. If in future we can also narrow down the list of possible targets computationally and accelerate experimental target validation, that will be important as well. The combination of computational and experimental approaches undertaken in this study represents a multistep workflow (S8 Fig) that could be applicable in any NTD drug discovery project. Efforts to automate, streamline and learn from the resulting data would further increase the efficiency of the approach we have described.
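
The target-proposal step (suggesting targets for an active compound from structurally similar compounds with known T. cruzi targets) can be sketched in the same spirit. The source does not state which similarity measure was used; the sketch below assumes Tanimoto similarity over sets of fingerprint on-bits, and the function names, the 0.5 threshold, and the reference compounds and target labels are all hypothetical placeholders.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def propose_targets(
    query: FrozenSet[int],
    annotated: Dict[str, Tuple[FrozenSet[int], str]],
    min_similarity: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Tuple[str, str, float]]:
    """List (reference compound, annotated target, similarity), most similar first.

    Only reference compounds at or above min_similarity are returned; the
    targets they carry are hypotheses to verify, not confirmed assignments.
    """
    hits = []
    for name, (bits, target) in annotated.items():
        similarity = tanimoto(query, bits)
        if similarity >= min_similarity:
            hits.append((name, target, similarity))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Toy data only: on-bit sets and target labels are invented placeholders.
query_bits = frozenset({1, 2, 3, 4})
reference_set = {
    "ref_1": (frozenset({1, 2, 3, 5}), "target_X"),
    "ref_2": (frozenset({7, 8}), "target_Y"),
    "ref_3": (frozenset({1, 2, 3, 4}), "target_Z"),
}
proposals = propose_targets(query_bits, reference_set)
```

A ranked list of this kind is one way to narrow the set of candidate targets before experimental validation, which the text above identifies as a goal for future work.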

