Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions.

Hur J, Özgür A, Xiang Z, He Y - J Biomed Semantics (2015)

Bottom Line: Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. The analysis of these interaction types and their associated gene-gene pairs uncovered many scientific insights.

Affiliation: Department of Neurology, University of Michigan, Ann Arbor, MI 48109 USA.

ABSTRACT

Background: Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords.

Methods: In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher's exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature.
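The abstract does not spell out how the Fisher's exact test was modified, so the following is only a minimal sketch of the standard one-sided Fisher's exact test (hypergeometric tail probabilities) that such an enrichment analysis typically builds on. The function names and the example counts are hypothetical and are not taken from the paper.

```python
from math import comb


def hypergeom_pmf(k, total, successes, draws):
    """P(X = k) when `draws` items are sampled without replacement from
    `total` items, of which `successes` carry the property of interest."""
    return comb(successes, k) * comb(total - successes, draws - k) / comb(total, draws)


def fisher_one_sided(k, total, successes, draws):
    """Return (p_over, p_under) = (P(X >= k), P(X <= k)).

    In an enrichment setting (an assumed mapping, not the paper's exact one):
      total     = gene pairs in the full literature-mined network
      successes = gene pairs annotated with a given interaction term
      draws     = gene pairs in the topic-specific sub-network
      k         = sub-network gene pairs annotated with that term
    """
    lowest = max(0, draws - (total - successes))
    highest = min(draws, successes)
    p_over = sum(hypergeom_pmf(i, total, successes, draws) for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, total, successes, draws) for i in range(lowest, k + 1))
    return p_over, p_under


# Hypothetical counts: 100 of 1000 pairs carry the term overall,
# but 15 of the 50 sub-network pairs do (5 would be expected by chance).
p_over, p_under = fisher_one_sided(k=15, total=1000, successes=100, draws=50)
print(f"over-representation p = {p_over:.3g}, under-representation p = {p_under:.3g}")
```

A small `p_over` would flag the term as over-represented in the sub-network and a small `p_under` as under-represented; testing many terms at once would normally also call for a multiple-testing correction.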

Results: INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. The current version of INO includes 540 terms. Among its interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with the 'INO_' prefix. A new annotation property, 'has literature mining keywords', was generated to list the different keywords that map to each interaction type in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher's exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these interaction types and their associated gene-gene pairs uncovered many scientific insights.
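The role of the 'has literature mining keywords' annotation property can be pictured as a lookup from surface keywords to interaction types. The sketch below is purely illustrative: the term names, keyword lists, and the `interaction_types` helper are hypothetical stand-ins, not the actual INO content or the authors' mining pipeline.

```python
import re

# Hypothetical miniature of a term -> keywords annotation table.
KEYWORDS_BY_TERM = {
    "cleavage reaction": ["cleave", "cleaves", "cleaved", "cleavage"],
    "phosphorylation": ["phosphorylate", "phosphorylates", "phosphorylated"],
}

# Invert the table so that each keyword points back to its interaction type.
TERM_BY_KEYWORD = {
    keyword: term
    for term, keywords in KEYWORDS_BY_TERM.items()
    for keyword in keywords
}


def interaction_types(sentence):
    """Return the sorted interaction types whose keywords occur in the sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sorted({TERM_BY_KEYWORD[t] for t in tokens if t in TERM_BY_KEYWORD})


print(interaction_types("GeneA phosphorylates GeneB and cleaves GeneC."))
```

Keeping the keywords attached to the ontology term, rather than in a flat list, is what lets a matched keyword be placed in the interaction hierarchy.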

Conclusions: INO provides a novel approach for defining hierarchical interaction types and related keywords for literature mining. The ontology-based literature mining, in combination with an INO-based statistical interaction enrichment test, provides a new platform for efficient mining and analysis of topic-specific gene interaction networks.


Figure 4: The hierarchies of over- and under-represented INO interaction terms. (A) The hierarchy of 14 over-represented INO interaction terms. (B) The hierarchy of 17 under-represented INO interaction terms. The results were generated using OntoFox [9] with the OntoFox setting “includeComputedIntermediates”, and visualized using the Protege-OWL editor (http://protege.stanford.edu/). The box-enclosed terms are over- or under-represented interaction types directly identified in our program (see Tables 2 and 3). Other terms not enclosed in boxes are terms retrieved by OntoFox to ensure the completeness of the hierarchies.

Mentions: One advantage of an INO-based study is that the INO hierarchy can be used to identify the relations among enriched interaction types. This strategy was used to generate the hierarchies of the 14 over-represented and 17 under-represented INO interaction types (Figure 4). The hierarchies show the relations between many different interaction terms. For example, the three over-represented terms ‘mRNA cleavage’, ‘RNA cleavage’, and ‘nucleic acid cleavage’ are linked by two parent–child relations (Figure 4): ‘nucleic acid cleavage’ is the parent of ‘RNA cleavage’, which is in turn the parent of ‘mRNA cleavage’. Interestingly, the more general term ‘cleavage reaction’, the parent of ‘nucleic acid cleavage’, is one of the 17 under-represented terms (Table 3). Besides these three cleavage types, there are many other specific ‘cleavage reaction’ types, for example protein cleavage, DNA cleavage, and lipid cleavage. In our calculation for the parent term ‘cleavage reaction’, we included all of its child terms. Therefore, the under-representation of ‘cleavage reaction’ indicates that the category of cleavage reactions as a whole is under-represented, even though the three specific reaction types above are over-represented.
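The statement that a parent term's calculation includes all of its child terms can be sketched as a roll-up over the hierarchy. In this minimal example, the four-level cleavage chain follows the text above, while the placement of ‘protein cleavage’ directly under ‘cleavage reaction’, the gene-pair labels, and the `rolled_up_pairs` helper are assumptions made for illustration.

```python
# Child -> parent links. The first three follow the text;
# the last one is an assumed placement.
PARENT = {
    "nucleic acid cleavage": "cleavage reaction",
    "RNA cleavage": "nucleic acid cleavage",
    "mRNA cleavage": "RNA cleavage",
    "protein cleavage": "cleavage reaction",
}

# Hypothetical gene pairs directly associated with each term.
DIRECT_PAIRS = {
    "mRNA cleavage": {("G1", "G2")},
    "RNA cleavage": {("G1", "G2"), ("G3", "G4")},
    "protein cleavage": {("G5", "G6")},
    "cleavage reaction": {("G7", "G8")},
}


def rolled_up_pairs(direct, parent):
    """Give every term the gene pairs of itself and all of its descendants.

    Sets are used so that a pair annotated at several levels counts once.
    """
    rolled = {term: set(pairs) for term, pairs in direct.items()}
    for term, pairs in direct.items():
        ancestor = parent.get(term)
        while ancestor is not None:
            rolled.setdefault(ancestor, set()).update(pairs)
            ancestor = parent.get(ancestor)
    return rolled


rolled = rolled_up_pairs(DIRECT_PAIRS, PARENT)
for term in ("mRNA cleavage", "RNA cleavage", "nucleic acid cleavage", "cleavage reaction"):
    print(term, len(rolled[term]))
```

Because the parent's count pools every cleavage type, a broad category can come out under-represented in an enrichment test even when a few of its specific children are over-represented, which is the pattern described for ‘cleavage reaction’.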

