A CTD-Pfizer collaboration: manual curation of 88,000 scientific articles text mined for drug-disease and drug-phenotype interactions.

Davis AP, Wiegers TC, Roberts PM, King BL, Lay JM, Lennon-Hopkins K, Sciaky D, Johnson R, Keating H, Greene N, Hernandez R, McConnell KJ, Enayetallah AE, Mattingly CJ - Database (Oxford) (2013)

Bottom Line: This curation can be leveraged for information about toxic endpoints important to drug safety and help develop testable hypotheses for drug-disease events. The availability of these detailed, contextualized, high-quality annotations curated from seven decades' worth of the scientific literature should help facilitate new mechanistic screening assays for pharmaceutical compound survival. This unique partnership demonstrates the importance of resource sharing and collaboration between public and private entities and underscores the complementary needs of the environmental health science and pharmaceutical communities.

Affiliation: Department of Biological Sciences, 3510 Thomas Hall, North Carolina State University, Raleigh, NC 27695-7617, USA; Computational Sciences Center of Emphasis, 200 Cambridgepark Drive, Pfizer Inc., Cambridge, MA 02139, USA; Department of Bioinformatics, P.O. Box 35, Old Bar Harbor Road, MDI Biological Laboratory, Salisbury Cove, ME 04672, USA; Compound Safety Prediction, MS 8118-B3, Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA; Computational Sciences Center of Emphasis, Pfizer Inc., Ramsgate Road, Sandwich, Kent CT13 9NJ, UK; Computational Sciences Center of Emphasis, 558 Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA; and Drug Safety Research and Development, 558 Eastern Point Road, Pfizer Inc., Groton, CT 06340, USA.

ABSTRACT
Improving the prediction of chemical toxicity is a goal common to both environmental health research and pharmaceutical drug development. To improve safety detection assays, it is critical to have a reference set of molecules with well-defined toxicity annotations for training and validation purposes. Here, we describe a collaboration between safety researchers at Pfizer and the research team at the Comparative Toxicogenomics Database (CTD) to text mine and manually review a collection of 88,629 articles relating over 1,200 pharmaceutical drugs to their potential involvement in cardiovascular, neurological, renal and hepatic toxicity. In 1 year, CTD biocurators curated 254,173 toxicogenomic interactions (152,173 chemical-disease, 58,572 chemical-gene, 5,345 gene-disease and 38,083 phenotype interactions). All chemical-gene-disease interactions are fully integrated with public CTD, and phenotype interactions can be downloaded. We describe Pfizer's text-mining process to collate the articles, and CTD's curation strategy, performance metrics, enhanced data content and new module to curate phenotype information. As well, we show how data integration can connect phenotypes to diseases. This curation can be leveraged for information about toxic endpoints important to drug safety and help develop testable hypotheses for drug-disease events. The availability of these detailed, contextualized, high-quality annotations curated from seven decades' worth of the scientific literature should help facilitate new mechanistic screening assays for pharmaceutical compound survival. This unique partnership demonstrates the importance of resource sharing and collaboration between public and private entities and underscores the complementary needs of the environmental health science and pharmaceutical communities. Database URL: http://ctdbase.org/
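The breakdown of curated interactions quoted in the abstract can be checked with a short tally. This is only an illustrative sketch: the category labels and counts are taken from the abstract, while the dictionary name and layout are assumptions made for the example.

```python
# Interaction counts as quoted in the abstract; the dict itself is illustrative.
interaction_counts = {
    "chemical-disease": 152_173,
    "chemical-gene": 58_572,
    "gene-disease": 5_345,
    "phenotype": 38_083,
}

# Sum the four categories to compare against the stated overall total.
total = sum(interaction_counts.values())
print(total)  # 254173
```

The four categories add up to the 254,173 toxicogenomic interactions reported for the one-year curation effort.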


Figure 7 (bat080-F7): Enhanced content helps develop testable hypotheses for known drug–disease events. CTD’s page for the drug bortezomib is selected for ‘Diseases’ data (orange tab), and the results have been filtered for the category ‘Nervous system disease’ (red circle) to focus on NeuroTox events. Bortezomib is connected to peripheral neuropathy by inference through 150 genes (red arrow, ‘Inference Network’). Embedded web tools automatically generate lists of enriched GO terms, pathway annotations and gene–gene interaction maps (blue arrows).

Mentions: This enhanced curated content can now be used to fill in the molecular gaps and find putative genes and pathways for developing testable hypotheses for drug–disease processes since CTD provides inference networks of genes that connect chemicals to diseases (11). For example, the drug bortezomib (a proteasome inhibitor used to treat multiple myeloma) is known to cause peripheral neuropathy in some patients, but the mechanistic process is still not clear (29). CTD discovers 150 genes that connect bortezomib to peripheral neuropathy, and the embedded web tools automatically calculate the enriched GO terms, pathway annotations and interaction maps for those connecting genes (Figure 7). This sophisticated knowledge management system can help researchers generate novel hypotheses about expanded molecular pathways of the drug–disease event and facilitate new screening assays for future pharmaceutical compound survival.
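The idea of an inference network, in which genes connect a chemical to a disease, can be sketched as a join between curated chemical-gene and gene-disease interactions. The snippet below is a minimal illustration under stated assumptions, not CTD's actual implementation: the gene symbols are placeholders, the records are toy data, and the function name `infer_chemical_disease` is hypothetical.

```python
from collections import defaultdict

# Toy curated records. Gene names are placeholders, not real CTD annotations.
chemical_gene = [
    ("bortezomib", "GENE_A"),
    ("bortezomib", "GENE_B"),
    ("bortezomib", "GENE_C"),
]
gene_disease = [
    ("GENE_A", "peripheral neuropathy"),
    ("GENE_B", "peripheral neuropathy"),
    ("GENE_C", "multiple myeloma"),
    ("GENE_D", "peripheral neuropathy"),
]


def infer_chemical_disease(chemical_gene, gene_disease):
    """Return {(chemical, disease): set of genes found in both record lists}."""
    # Index diseases by gene so each chemical-gene pair can be looked up directly.
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    # A chemical is connected to a disease through every gene they share.
    inferred = defaultdict(set)
    for chemical, gene in chemical_gene:
        for disease in diseases_by_gene.get(gene, ()):
            inferred[(chemical, disease)].add(gene)
    return dict(inferred)


network = infer_chemical_disease(chemical_gene, gene_disease)
connecting_genes = sorted(network[("bortezomib", "peripheral neuropathy")])
print(connecting_genes)  # ['GENE_A', 'GENE_B']
```

In this toy example two genes connect bortezomib to peripheral neuropathy; in CTD the corresponding set contains 150 genes, which then feed the enrichment and interaction-map tools described above.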

