Semantically enabled and statistically supported biological hypothesis testing with tissue microarray databases.

Song YS, Park CH, Chung HJ, Shin H, Kim J, Kim JH - BMC Bioinformatics (2011)

Bottom Line: Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQL queries to reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries. We believe that this approach can benefit preliminary investigations carried out before highly controlled experiments.


Affiliation: Department of Industrial & Information Systems Engineering, Ajou University, Suwon 443-749, Korea.

ABSTRACT

Background: Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires both the ability to obtain semantically relevant experimental data and the ability to perform relevant statistical tests on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions.

Methods: An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented in the Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQL queries to reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries.

Results: When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF.

Conclusions: We demonstrated the utility of high throughput biological hypothesis testing. We believe that this approach can benefit preliminary investigations carried out before highly controlled experiments.
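
The three steps listed in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Xperanto-RDF implementation: the SPARQL text, the `ex:` prefix, the property names, and the rows are all hypothetical, the query is shown only as text and is not executed, and Fisher's exact test is chosen here purely as an example of a statistical test on a result set (the abstract does not say which tests the authors used).

```python
from math import comb

# Step (2), illustrative only: hypothetical SPARQL text. The prefix, class, and
# property names are invented and are not taken from the Xperanto-RDF ontology.
QUERY = """
PREFIX ex: <http://example.org/tma#>
SELECT ?core ?stage ?staining
WHERE {
  ?core a ex:TissueCore ;
        ex:tumorStage ?stage ;
        ex:stainingResult ?staining .
}
"""

# Made-up rows standing in for the result set such a query might return.
ROWS = [
    ("core1", "early", "negative"),
    ("core2", "early", "negative"),
    ("core3", "early", "negative"),
    ("core4", "early", "positive"),
    ("core5", "late", "positive"),
    ("core6", "late", "positive"),
    ("core7", "late", "positive"),
    ("core8", "late", "negative"),
]


def contingency(rows):
    """Count cores into a 2x2 table: (late+, late-, early+, early-)."""
    counts = {}
    for _core, stage, staining in rows:
        counts[(stage, staining)] = counts.get((stage, staining), 0) + 1
    return (
        counts.get(("late", "positive"), 0),
        counts.get(("late", "negative"), 0),
        counts.get(("early", "positive"), 0),
        counts.get(("early", "negative"), 0),
    )


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = pmf(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the tables that are no more probable than the observed one.
    return sum(p for p in map(pmf, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Step (3): run the test on the (made-up) result set.
table = contingency(ROWS)
p_value = fisher_exact_two_sided(*table)
print(table, round(p_value, 4))
```

In a real setting the rows would come from a SPARQL endpoint or an RDF library rather than a hard-coded list, and the choice of test would depend on the form of the hypothesis.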



Figure 1: Ontology and data representation as RDF. (a) A part of the ontology for main entities represented as RDF graphs. C (pale blue rectangles) represents classes, P (pink rectangles) properties, A anonymous nodes, empty arrows rdfs:subClassOf, pink arrows rdfs:domain, green arrows rdfs:range, and red arrows owl:unionOf. (b) Relationship between data tables in a source database (Xperanto-TMA) and data graphs represented as RDF graphs. Namespaces are omitted for simplicity. Violet ellipses represent classes, orange rectangles object properties, green rectangles data type properties, and dotted arrows class-instance relationships.

Mentions: To give the RDF data a framework and to facilitate integration, an ontology specific to our applications was developed. The ontology design drew on our previous studies implementing Xperanto and Xperanto-TMA, because the semantics of TMA and DNA microarray experiments had already been analyzed extensively in those works [12,14]. Based on those works, the ontology was expressed in OWL. Part of the ontology is shown in Fig. 1(a) (namespaces are omitted for simplicity).
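
As a rough illustration of how the schema relations named in the Fig. 1 legend (rdfs:subClassOf, rdfs:domain, rdfs:range) give instance data a framework, the sketch below applies the standard RDFS entailment rules for those three relations to a handful of triples. The class, property, and instance names are hypothetical and are not taken from the authors' ontology; namespaces are omitted, as in the figure.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Hypothetical schema triples (first three) and one data triple (last).
TRIPLES = {
    ("TMAExperiment", SUBCLASS, "Experiment"),
    ("hasCore", DOMAIN, "TMAExperiment"),
    ("hasCore", RANGE, "TissueCore"),
    ("exp1", "hasCore", "core1"),
}


def infer_types(triples):
    """Add rdf:type triples entailed by domain, range, and subClassOf."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # Domain rule: the subject of property p is an instance of o2.
                if s2 == p and p2 == DOMAIN:
                    new.add((s, RDF_TYPE, o2))
                # Range rule: the object of property p is an instance of o2.
                if s2 == p and p2 == RANGE:
                    new.add((o, RDF_TYPE, o2))
                # Subclass rule: an instance of o is also an instance of o2.
                if p == RDF_TYPE and s2 == o and p2 == SUBCLASS:
                    new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


closure = infer_types(TRIPLES)
for triple in sorted(closure - TRIPLES):
    print(triple)
```

Note that rdfs:domain and rdfs:range act here as inference rules that assign types, not as validation constraints that reject data.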

