GOMMA: a component-based infrastructure for managing and analyzing life science ontologies and their evolution.

Kirsten T, Gross A, Hartung M, Rahm E - J Biomed Semantics (2011)

Bottom Line: The increasing size of ontologies and the high frequency of updates, which result in a large set of ontology versions, necessitate efficient management and analysis of this data. We introduce the component-based infrastructure and show analysis results for selected components and life science applications. GOMMA complements OnEX by providing functionalities to manage various versions of mappings between two ontologies and allows combining different match approaches.


Affiliation: Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstraße 16-18, 04107 Leipzig, Germany. tkirsten@izbi.uni-leipzig.de.

ABSTRACT

Background: Ontologies are increasingly used to structure and semantically describe entities of domains, such as genes and proteins in life sciences. Their increasing size and the high frequency of updates, which result in a large set of ontology versions, necessitate efficient management and analysis of this data.

Results: We present GOMMA, a generic infrastructure for managing and analyzing life science ontologies and their evolution. GOMMA utilizes a generic repository to uniformly and efficiently manage ontology versions and different kinds of mappings. Furthermore, it provides components for ontology matching and for determining evolutionary ontology changes. These components are used by analysis tools, such as the Ontology Evolution Explorer (OnEX) and the detection of unstable ontology regions. We introduce the component-based infrastructure and show analysis results for selected components and life science applications. GOMMA is available at http://dbs.uni-leipzig.de/GOMMA.

Conclusions: GOMMA provides a comprehensive and scalable infrastructure to manage large life science ontologies and analyze their evolution. Key functions include generic storage of ontology versions and mappings, support for ontology matching, and the determination of ontology changes. The supported features for analyzing ontology changes help to assess their impact on ontology-dependent applications such as term enrichment analysis. GOMMA complements OnEX by providing functionalities to manage various versions of mappings between two ontologies and by allowing different match approaches to be combined.




Figure 4: Application scenario: Term enrichment analysis. The figure shows analysis results for a term enrichment analysis of a gene set using a hypergeometric test from the FUNC package [11]. The experiment was executed for two Gene Ontology Molecular Function (GO-MF) versions: 2009-09 (a) and 2011-03 (b). The gene and annotation set were not modified. Colored categories denote significantly enriched categories w.r.t. the used gene set and ontology version. The table (c) shows more detailed information for each significant category, e.g., the number of indirect (propagated) gene annotations (/A/).

Mentions: We ran a term enrichment analysis using the hypergeometric test from the FUNC package [11] on a publicly available example data set (http://fasta.bioch.virginia.edu/cshl/stubbs/data/TF1/TF1_ForFUNC_Hyper.txt). This data set was initially based on Gene Ontology version 2009-09. We repeated the analysis using the original as well as a newer GO version (2011-03) and compared the result sets for the Molecular Function part of GO (GO-MF). Figures 4(a) and 4(b) show the GO-MF subgraphs with the significant result categories of this analysis. We observe that the statistical test for the new GO-MF version leads to a significantly changed result set. In particular, one category (in orange) is no longer in the new result set, while three categories (in green) appear as significant only for the new version. Only two categories (in yellow) are present in both result sets. This indicates that the results of such term enrichment analyses can be highly dependent on ontology evolution and, thus, on the ontology version used. When introducing the GOMMA functions for evolution analysis in the following, we also explain their usability for the example scenario.
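
The effect described above can be illustrated with a minimal sketch. It is not the FUNC package and not GOMMA code: it assumes a toy ontology with three hypothetical categories (R, B, C), twenty hypothetical genes, a plain one-sided hypergeometric test without multiple-testing correction, and an arbitrary significance level of 0.05. The direct gene annotations are identical for both runs; only the is_a structure differs between the two hypothetical ontology versions, which changes the propagated (indirect) annotations and therefore the set of significant categories.

```python
from math import comb

def propagate(direct, parents):
    """Map each category to its genes, including genes annotated to descendants."""
    propagated = {}
    for term, genes in direct.items():
        stack, seen = [term], set()
        while stack:  # walk from the term up to all of its ancestors
            t = stack.pop()
            if t in seen:
                continue
            seen.add(t)
            propagated.setdefault(t, set()).update(genes)
            stack.extend(parents.get(t, ()))
    return propagated

def hypergeom_pvalue(k, K, n, N):
    """P(X >= k): n genes drawn from N, of which K are annotated to the category."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enriched(direct, parents, study, population, alpha=0.05):
    """Return the categories whose enrichment p-value is below alpha."""
    result = set()
    for term, genes in propagate(direct, parents).items():
        genes = genes & population
        k = len(genes & study)
        if k and hypergeom_pvalue(k, len(genes), len(study), len(population)) < alpha:
            result.add(term)
    return result

def compare(old, new):
    """Split two result sets into removed, added, and stable categories."""
    return {"only_old": old - new, "only_new": new - old, "both": old & new}

# Hypothetical toy data: 20 genes, unchanged direct annotations.
population = {f"g{i}" for i in range(1, 21)}
direct = {
    "B": {"g1", "g2", "g6"},
    "C": {"g3", "g4"},
    "R": population - {"g1", "g2", "g3", "g4", "g6"},
}
study = {"g1", "g2", "g3", "g4"}

# Two hypothetical ontology versions: C moves from below R to below B.
parents_old = {"B": ["R"], "C": ["R"]}
parents_new = {"B": ["R"], "C": ["B"]}

old_result = enriched(direct, parents_old, study, population)
new_result = enriched(direct, parents_new, study, population)
diff = compare(old_result, new_result)
print(diff)
```

In this toy setting, category B becomes significant only in the second version because it inherits the genes of C after the restructuring, while C is significant in both versions. A real analysis would read the ontology versions and annotations from files and would apply the corrections implemented by the chosen enrichment tool.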

