SWEETLEAD: an in silico database of approved drugs, regulated chemicals, and herbal isolates for computer-aided drug discovery.

Novick PA, Ortiz OF, Poelman J, Abdulhay AY, Pande VS - PLoS ONE (2013)

Bottom Line: Computer-aided virtual screening and repurposing approved drugs are two such strategies that have shown recent success. A consensus building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. Resulting structures are filtered for the active pharmaceutical ingredient, standardized, and differing formulations of the same drug were combined in the final database.


Affiliation: Department of Chemistry, Stanford University, Stanford, California, United States of America.

ABSTRACT
In the face of drastically rising drug discovery costs, strategies promising to reduce development timelines and expenditures are being pursued. Computer-aided virtual screening and repurposing approved drugs are two such strategies that have shown recent success. Herein, we report the creation of a highly curated in silico database of chemical structures representing approved drugs, chemical isolates from traditional medicinal herbs, and regulated chemicals, termed the SWEETLEAD database. The motivation for SWEETLEAD stems from the observation of conflicting information in publicly available chemical databases and the lack of a highly curated database of chemical structures for the globally approved drugs. A consensus building scheme surveying information from several publicly accessible databases was employed to identify the correct structure for each chemical. The resulting structures were filtered for the active pharmaceutical ingredient and standardized, and differing formulations of the same drug were combined in the final database. The publicly available release of SWEETLEAD (https://simtk.org/home/sweetlead) provides an important tool to enable the successful completion of computer-aided repurposing and drug discovery campaigns.

Figure 2. Workflow of the consensus building algorithm. The process of identifying a correct structure for a given drug begins with a drug or chemical name. In the first stage of the algorithm, the Data Collection stage, several databases are polled by the name, and the database IDs linked to that name are retrieved and ranked by how frequently each ID was returned (i.e., which ID is ‘most popular’ among the databases polled). For each ID returned, the chemical structure associated with that ID is retrieved and standardized (salts removed, standard protonation states and aromaticity models applied, etc.). In the second stage, the Data Curation stage, the most popular structures from each database are compared. If all structures match, the structure is assumed to be correct and is assigned to the drug name in the final SWEETLEAD database. If the structures do not match, an iterative cycling through the most popular structures for each database attempts to identify a consensus structure for the drug name. If a consensus or majority structure cannot be identified, a manual review is undertaken. Finally, duplicate structures in SWEETLEAD are combined, to allow for numerous brand names and other identifiers for approved drugs.
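The Data Curation stage described in the caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `find_consensus`, the `max_depth` parameter, and the strict-majority rule used for the "iterative cycling" step are all assumptions made for illustration, since the text does not specify how the cycling is carried out. Structures are assumed to have already been standardized to comparable SMILES strings.

```python
from collections import Counter


def find_consensus(ranked_by_db, max_depth=3):
    """Pick a consensus structure from per-database ranked candidates.

    ranked_by_db maps a (hypothetical) database name to its standardized
    SMILES strings, most popular first. Returns (smiles, status), where
    status is 'unanimous', 'majority', or 'manual_review'.
    """
    lists = [s for s in ranked_by_db.values() if s]
    if not lists:
        return None, "manual_review"

    # Case 1: the top-ranked structure of every database is identical.
    tops = {s[0] for s in lists}
    if len(tops) == 1:
        return tops.pop(), "unanimous"

    # Case 2 (assumed rule): widen the window of candidates per database
    # step by step and accept a structure once it is listed by a strict
    # majority of databases and is the only structure that does so.
    for depth in range(1, max_depth + 1):
        votes = Counter()
        for s in lists:
            votes.update(set(s[:depth]))  # one vote per database
        winners = [smi for smi, n in votes.items() if 2 * n > len(lists)]
        if len(winners) == 1:
            return winners[0], "majority"

    # Case 3: no consensus or majority, so defer to a human curator.
    return None, "manual_review"


# Toy inputs; the database names and SMILES are placeholders only.
unanimous = find_consensus({"db_a": ["CCO"], "db_b": ["CCO"], "db_c": ["CCO"]})
majority = find_consensus({"db_a": ["CCO", "CCN"], "db_b": ["CCO"], "db_c": ["CCN", "CCO"]})
unresolved = find_consensus({"db_a": ["CCO"], "db_b": ["CCN"]})
```

Returning an explicit status alongside the structure keeps the cases that need manual review easy to separate from those resolved automatically.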

Mentions: The overall workflow of our tool is shown in Figure 2 and can largely be divided into data collection and data curation stages. In the first step of the data collection stage, a drug name is taken as input. Several Reference Databases are then queried using this name, and the internal and external Structural Database IDs from each are collected. The list of Structural Database IDs returned from the previous step is then sorted according to how frequently they were returned; a database ID returned by all queried sources would thus become the ‘most popular’ and highest ranked ID, and less frequently returned IDs would be ranked lower. Next, each Structural Database is queried by database ID, and the molecular structure associated with each ID is collected (preferably as a 2-D SDF file). Given that each Structural Database uses a unique processing workflow in creating these structure files, including differing aromaticity models and protonation states, it is necessary to standardize these structures prior to comparison. Standardization is accomplished by stripping salts and other fragments that are not part of the active pharmaceutical ingredient (API), assigning specific chirality (when appropriate), applying a common aromaticity model, protonating the API as predicted at pH 7, and finally representing the structure as an isomeric SMILES string.
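The frequency-based ranking of Structural Database IDs can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the function name `rank_ids`, the input layout (a mapping from each Reference Database to the `(structural_db, id)` pairs it returned), and all database names and IDs shown are hypothetical.

```python
from collections import Counter, defaultdict


def rank_ids(ids_by_source):
    """Rank Structural Database IDs by how many sources returned them.

    ids_by_source maps a reference-database name to the list of
    (structural_db, db_id) pairs it returned for one drug name.
    Returns {structural_db: [db_id, ...]} with the most popular ID first.
    """
    counts = defaultdict(Counter)
    for pairs in ids_by_source.values():
        # Count each ID at most once per reference source.
        for structural_db, db_id in set(pairs):
            counts[structural_db][db_id] += 1

    ranked = {}
    for structural_db, counter in counts.items():
        # Most frequently returned first; ties broken by ID for determinism.
        ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        ranked[structural_db] = [db_id for db_id, _ in ordered]
    return ranked


# Placeholder query results for a single drug name.
ids_by_source = {
    "ref_1": [("struct_db_a", "A-1"), ("struct_db_b", "B-7")],
    "ref_2": [("struct_db_a", "A-1")],
    "ref_3": [("struct_db_a", "A-2"), ("struct_db_a", "A-1"), ("struct_db_b", "B-7")],
}
ranked = rank_ids(ids_by_source)
```

Here `A-1` is returned by all three sources and therefore ranks ahead of `A-2`, mirroring the 'most popular' ordering described above. Structure retrieval and standardization would follow this step and typically rely on a cheminformatics toolkit, which is outside the scope of this sketch.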

