Finding gene regulatory network candidates using the gene expression knowledge base.

Venkatesan A, Tripathi S, Sanz de Galdeano A, Blondé W, Lægreid A, Mironov V, Kuiper M - BMC Bioinformatics (2014)

Bottom Line: Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.


Affiliation: Department of Biology, Norwegian University of Science and Technology (NTNU), N-7491, Trondheim, Norway. aravind.venkatesan@ntnu.no.

ABSTRACT

Background: Network-based approaches for the analysis of large-scale genomics data have become well established. Biological networks provide a knowledge scaffold against which the patterns and dynamics of 'omics' data can be interpreted. The background information required for the construction of such networks is often dispersed across a multitude of knowledge bases in a variety of formats. The seamless integration of this information is one of the main challenges in bioinformatics. The Semantic Web offers powerful technologies for the assembly of integrated knowledge bases that are computationally comprehensible, thereby providing a potentially powerful resource for constructing biological networks and network-based analysis.

Results: We have developed the Gene eXpression Knowledge Base (GeXKB), a semantic web technology based resource that contains integrated knowledge about gene expression regulation. To affirm the utility of GeXKB we demonstrate how this resource can be exploited for the identification of candidate regulatory network proteins. We present four use cases that were designed from a biological perspective in order to find candidate members relevant for the gastrin hormone signaling network model. We show how a combination of specific query definitions and additional selection criteria derived from gene expression data and prior knowledge concerning candidate proteins can be used to retrieve a set of proteins that constitute valid candidates for regulatory network extensions.

Conclusions: Semantic web technologies provide the means for processing and integrating various heterogeneous information sources. The GeXKB offers biologists such an integrated knowledge resource, allowing them to address complex biological questions pertaining to gene expression. This work illustrates how GeXKB can be used in combination with gene expression results and literature information to identify new potential candidates that may be considered for extending a gene regulatory network.



Figure 6: Result evaluation. The flowchart illustrates the evaluation of the results returned for use cases I through IV. The proteins retrieved for use cases I, II and III were first classified by their presence in the CCK2R map, yielding two groups, a and b. The proteins in group b were further evaluated for evidence of gastrin-induced regulation, yielding sub-group b1. Proteins in b1 were then divided by literature evidence into those reported to respond to stimuli other than gastrin (b1i) and those not reported to respond to other stimuli (b1j). Proteins qualifying both as b1 and b1i were considered the most promising new putative network members. Similarly, the target genes returned for use case IV were evaluated for their expression in the AR42J cell system and for whether they were gastrin responsive. Genes that satisfied both criteria were prioritized as putative network members.

Mentions: The results returned for use cases I through III were investigated for their relevance to the gastrin response network [21] by categorizing them into two disjoint sets: a) proteins that have already been documented as members of the gastrin response network, and b) potential novel components of the gastrin response network. Within the latter, a subset of regulators responsive to gastrin, referred to as b1 below, was identified on the basis of transcriptomic data from a 14 h time series gastrin response data set [19]. Within b1, two disjoint subsets were defined: proteins known to be responsive to stimuli other than gastrin (b1i) and proteins not known to be (b1j). The purpose of this classification was to prioritize the putative components. For instance, b1i proteins were given higher priority as new putative members of the gastrin response network because of the available literature evidence, whereas proteins in category b1j remain potentially interesting for future laboratory work, but with a lower priority. Finally, in use case IV the results returned for Q6 were assessed on whether the genes regulated by the DbTFs in the query are expressed in the AR42J cell line and whether their expression changed in response to gastrin stimulation (see Figure 6). The six SPARQL queries and the results of use cases I - III are available in Additional files 2 and 3, respectively.
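The prioritization scheme described above can be written as a small decision procedure. The following is a minimal sketch in Python, not the authors' implementation: the `Candidate` record, its field names, the `b_other` label for group b proteins outside b1, and the function names are all assumptions made for illustration. In the study, the inputs came from the CCK2R map, the gastrin time series data and literature evidence.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one protein returned by use cases I-III."""
    name: str
    in_cck2r_map: bool            # already documented in the gastrin response network
    gastrin_responsive: bool      # regulated in the gastrin time series data
    other_stimuli_evidence: bool  # literature reports a response to other stimuli


def classify(c: Candidate) -> str:
    """Assign a candidate to group a, b1i, b1j, or the rest of group b."""
    if c.in_cck2r_map:
        return "a"
    if not c.gastrin_responsive:
        return "b_other"  # in group b but outside sub-group b1 (label is an assumption)
    return "b1i" if c.other_stimuli_evidence else "b1j"


def prioritize(candidates: Iterable[Candidate]) -> Dict[str, List[str]]:
    """Group candidate names by class; b1i holds the highest-priority novel members."""
    groups: Dict[str, List[str]] = {"a": [], "b_other": [], "b1i": [], "b1j": []}
    for c in candidates:
        groups[classify(c)].append(c.name)
    return groups


def prioritize_targets(targets: Iterable[Tuple[str, bool, bool]]) -> List[str]:
    """Use case IV: keep target genes that are both expressed in AR42J and gastrin responsive.

    Each item is (gene, expressed_in_ar42j, gastrin_responsive).
    """
    return [gene for gene, expressed, responsive in targets if expressed and responsive]
```

Calling `prioritize` on a list of `Candidate` records returns a dictionary keyed by group, so the `b1i` entry can be read off directly as the shortlist for follow-up, with `b1j` as the lower-priority list.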

