Linking the Resource Description Framework to cheminformatics and proteochemometrics.

Willighagen EL, Alvarsson J, Andersson A, Eklund M, Lampa S, Lapins M, Spjuth O, Wikberg JE - J Biomed Semantics (2011)

Bottom Line: Ontologies and semantic markup have already been used for more than a decade in molecular sciences, but have not found widespread use yet. Being able to create and share workflows that integrate data aggregation and analysis (visual and statistical) is beneficial to interoperability and reproducibility. The current work shows that RDF approaches are sufficiently powerful to support molecular chemometrics workflows.


Affiliation: Uppsala University, Department of Pharmaceutical Biosciences, Box 591, SE-751 24 Uppsala, Sweden. egon.willighagen@farmbio.uu.se.

ABSTRACT

Background: Semantic web technologies are finding their way into the life sciences. Ontologies and semantic markup have already been used for more than a decade in molecular sciences, but have not found widespread use yet. The semantic web technology Resource Description Framework (RDF) and related methods appear to be sufficiently versatile to change that situation.

Results: The work presented here focuses on linking RDF approaches to existing molecular chemometrics fields, including cheminformatics, QSAR modeling and proteochemometrics. Applications are presented that link RDF technologies to methods from statistics and cheminformatics, including data aggregation, visualization, chemical identification, and property prediction. They demonstrate how this can be done using various existing RDF standards and cheminformatics libraries. For example, we show how IC50 and Ki values are modeled for a number of biological targets using data from the ChEMBL database.

Conclusions: We have shown that existing RDF standards can suitably be integrated into existing molecular chemometrics methods. Platforms that unite these technologies, like Bioclipse, make this even simpler and more transparent. Being able to create and share workflows that integrate data aggregation and analysis (visual and statistical) is beneficial to interoperability and reproducibility. The current work shows that RDF approaches are sufficiently powerful to support molecular chemometrics workflows.


Figure 10: Screenshot of one of the Bioclipse Wizard pages to set up a new QSAR project. The wizard allows the user to interactively select a target and activity, using SPARQL functionality to download title, type, and organism details for the currently selected target. The wizard automatically updates the list of allowable activity types for the given target, which in this example is the sialidase target.

Mentions: A second plugin uses this new functionality to integrate the ChEMBL SPARQL endpoint with the QSAR feature of Bioclipse [45]. The plugin provides a New Wizard that bootstraps a new QSAR project by aggregating data directly from the ChEMBL database. It accepts a ChEMBL targetID and an activity type (e.g. IC50 or Kd), as shown in the screenshot in Figure 10. While the user is typing the targetID, the wizard uses SPARQL, via the aforementioned wrapping API, to ask the RDF database for the title, type, and organism of the current target, and updates the wizard page with that information. It also queries the database for the available activity types, such as IC50, Inhibition, Ki app, Ki, and a general Activity for the targetID 101107 given in the figure. The wizard does not yet provide full-text search for targets based on the labels, keywords, and descriptions available in the ChEMBL database, but SPARQL makes such applications possible too.
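
The two lookups described above can be sketched in Python. This is an illustration only, not the Bioclipse plugin code: the `ex:` namespace, the predicate names (`ex:title`, `ex:type`, `ex:organism`, `ex:hasTarget`, `ex:onAssay`), the target URI pattern, and all function names are hypothetical placeholders, since the actual ChEMBL RDF vocabulary is not given here. Only the SPARQL protocol mechanics (a `query` parameter and the JSON results layout) follow the W3C standards.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical namespace; the real ChEMBL RDF vocabulary may differ.
PREFIX = "PREFIX ex: <http://example.org/chembl#>"


def target_uri(target_id):
    """Build a (hypothetical) target URI; int() rejects non-numeric input."""
    return f"<http://example.org/chembl/target/t{int(target_id)}>"


def target_details_query(target_id):
    """SELECT query for title, type and organism of one target."""
    t = target_uri(target_id)
    return f"""{PREFIX}
SELECT ?title ?type ?organism WHERE {{
  {t} ex:title ?title ;
      ex:type ?type ;
      ex:organism ?organism .
}}"""


def activity_types_query(target_id):
    """SELECT query for the distinct activity types measured on one target."""
    t = target_uri(target_id)
    return f"""{PREFIX}
SELECT DISTINCT ?activityType WHERE {{
  ?assay ex:hasTarget {t} .
  ?activity ex:onAssay ?assay ;
            ex:type ?activityType .
}} ORDER BY ?activityType"""


def bindings_to_rows(results):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


def run_select(endpoint_url, query):
    """Send a SELECT query via HTTP GET and return the rows.

    endpoint_url is supplied by the caller; no particular endpoint is assumed.
    """
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return bindings_to_rows(json.load(response))
```

A wizard-like client would call `run_select(endpoint, target_details_query(101107))` whenever the targetID field changes, then `run_select(endpoint, activity_types_query(101107))` to refresh the list of selectable activity types. Converting the ID with `int()` before interpolating it keeps arbitrary user input out of the query text.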

