Integration of extracellular RNA profiling data using metadata, biomedical ontologies and Linked Data technologies.

Subramanian SL, Kitchen RR, Alexander R, Carter BS, Cheung KH, Laurent LC, Pico A, Roberts LR, Roth ME, Rozowsky JS, Su AI, Gerstein MB, Milosavljevic A - J Extracell Vesicles (2015)

Bottom Line: The large diversity and volume of extracellular RNA (exRNA) data that will form the basis of the exRNA Atlas generated by the Extracellular RNA Communication Consortium pose a substantial data integration challenge. We focus on the following three specific data integration tasks: (a) selection of samples from a virtual biorepository for exRNA profiling and for inclusion in the exRNA Atlas; (b) retrieval of a data slice from the exRNA Atlas for integrative analysis and (c) interpretation of exRNA analysis results in the context of pathways and networks. As exRNA profiling gains wide adoption in the research community, we anticipate that the strategies discussed here will increasingly be required to enable data reuse and to facilitate integrative analysis of exRNA data.


Affiliation: Bioinformatics Research Laboratory, Department of Molecular & Human Genetics, Baylor College of Medicine, Houston, TX, USA.

ABSTRACT
The large diversity and volume of extracellular RNA (exRNA) data that will form the basis of the exRNA Atlas generated by the Extracellular RNA Communication Consortium pose a substantial data integration challenge. Here we present the strategy being implemented by the exRNA Data Management and Resource Repository, which employs metadata, biomedical ontologies and Linked Data technologies such as the Resource Description Framework (RDF) to integrate a diverse set of exRNA profiles into an exRNA Atlas and to enable integrative exRNA analysis. We focus on the following three specific data integration tasks: (a) selection of samples from a virtual biorepository for exRNA profiling and for inclusion in the exRNA Atlas; (b) retrieval of a data slice from the exRNA Atlas for integrative analysis and (c) interpretation of exRNA analysis results in the context of pathways and networks. As exRNA profiling gains wide adoption in the research community, we anticipate that the strategies discussed here will increasingly be required to enable data reuse and to facilitate integrative analysis of exRNA data.


Figure 1: Data slicing and pathway enrichment analysis. This illustration is based on a hypothetical example of sequencing-based exRNA profiling of cerebrospinal fluid (CSF) from a brain tumour patient. Based on metadata about the selected samples, a “data slice” (panel a) is extracted for downstream analysis using pathway/network modules to detect activation of a metastatic brain tumour pathway. Panel b details the selection of samples for profiling and inclusion in the exRNA Atlas using sample (CSF) and disease (CNS neoplasm) ontology traversals. Panel c details the sequencing assay selection process using assay and experiment ontology traversals. The highlighted ontology terms “CNS neoplasm” and “sequencing assay” are examples of terms that occur within an “ontology slim.” Ontology traversal in panel d identifies RNA species of interest. The “data slice” (panel a) defined by the selections in panels b–d is analysed to obtain a set of exRNA genes that show a pattern of coordinated changes. The metastatic brain tumour pathway (www.wikipathways.org/index.php/Pathway:WP2249) in panel e shows enrichment for the exRNA genes overexpressed in this hypothetical case.
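The ontology traversals that define a data slice can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny is-a hierarchy, the sample records, the field names and the helper functions `ancestors` and `data_slice` are hypothetical and do not reflect the actual exRNA Atlas schema or any real ontology file. It shows only the general idea that a sample annotated with a specific term is retrieved when the query names a broader term, such as “CNS neoplasm” or “sequencing assay”.

```python
from collections import deque

# Hypothetical is-a edges (child -> parents); real ontologies are far larger.
IS_A = {
    "metastatic brain tumour": ["CNS neoplasm"],
    "glioblastoma": ["CNS neoplasm"],
    "CNS neoplasm": ["neoplasm"],
    "cerebrospinal fluid": ["body fluid"],
    "plasma": ["body fluid"],
    "small RNA-seq": ["sequencing assay"],
    "qPCR": ["amplification assay"],
    "sequencing assay": ["assay"],
    "amplification assay": ["assay"],
}


def ancestors(term):
    """Return the term itself plus every term reachable via is-a edges."""
    seen = {term}
    queue = deque([term])
    while queue:
        for parent in IS_A.get(queue.popleft(), []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical sample metadata records (illustrative field names).
SAMPLES = [
    {"id": "S1", "biofluid": "cerebrospinal fluid",
     "disease": "metastatic brain tumour", "assay": "small RNA-seq"},
    {"id": "S2", "biofluid": "plasma",
     "disease": "glioblastoma", "assay": "small RNA-seq"},
    {"id": "S3", "biofluid": "cerebrospinal fluid",
     "disease": "glioblastoma", "assay": "qPCR"},
    {"id": "S4", "biofluid": "cerebrospinal fluid",
     "disease": "glioblastoma", "assay": "small RNA-seq"},
]


def data_slice(samples, **criteria):
    """Keep samples whose annotation equals or descends from each query term."""
    return [
        s["id"]
        for s in samples
        if all(term in ancestors(s[field]) for field, term in criteria.items())
    ]


slice_ids = data_slice(
    SAMPLES,
    biofluid="cerebrospinal fluid",
    disease="CNS neoplasm",
    assay="sequencing assay",
)
print(slice_ids)  # ['S1', 'S4']
```

Matching against ancestors rather than exact strings is what lets a query phrased at the level of an ontology slim term pick up samples annotated at finer granularity.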
© Copyright Policy - open-access

Mentions: Both metadata and ontologies fall within the broad category of approaches to data integration that also includes Linked Data technologies such as RDF (Resource Description Framework; www.w3.org/RDF/). The Consortium aims to develop an RDF knowledge base about pathways and network modules of relevance for exRNA biology that will inform interpretation of exRNA profiling data. In the following, we review a strategy to employ metadata, ontology-based reasoning and RDF to integrate and analyse exRNA profiling data, focusing on the three tasks highlighted in Fig. 1a.
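As a rough illustration of how an RDF-style knowledge base of pathways could support interpretation, the following Python sketch stores subject-predicate-object triples in a plain set, retrieves pathway members by pattern matching, and scores the overlap with a gene list using a hypergeometric test. The predicate `ex:hasMember`, the placeholder gene identifiers, the universe size of 20 genes and the functions `match`, `members` and `enrichment_p` are all assumptions introduced for the example; only the pathway identifier WP2249 comes from the text. A real deployment would use an RDF store and SPARQL rather than an in-memory set, and the hypergeometric test is one common choice of enrichment statistic, not necessarily the one used by the Consortium.

```python
from math import comb

# Hypothetical triples; gene identifiers and predicates are placeholders.
TRIPLES = {
    ("pathway:WP2249", "rdfs:label", "Metastatic brain tumour"),
    ("pathway:WP2249", "ex:hasMember", "gene:G1"),
    ("pathway:WP2249", "ex:hasMember", "gene:G2"),
    ("pathway:WP2249", "ex:hasMember", "gene:G3"),
    ("pathway:WP2249", "ex:hasMember", "gene:G4"),
    ("pathway:EX1", "ex:hasMember", "gene:G9"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def members(pathway):
    """Collect the genes linked to a pathway by the ex:hasMember predicate."""
    return {o for _, _, o in match(s=pathway, p="ex:hasMember")}


def enrichment_p(hits, pathway_size, selected, universe):
    """Hypergeometric upper tail: P(overlap >= hits) under random selection."""
    upper = min(pathway_size, selected)
    favourable = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(universe, selected)


# Hypothetical set of overexpressed exRNA genes from a data slice.
overexpressed = {"gene:G1", "gene:G2", "gene:G3", "gene:G9"}
pathway_genes = members("pathway:WP2249")
hits = len(overexpressed & pathway_genes)

# Assumed universe of 20 measured genes, chosen only to keep numbers small.
p_value = enrichment_p(hits, len(pathway_genes), len(overexpressed), 20)
print(hits, round(p_value, 4))  # 3 0.0134
```

With three of the four selected genes falling in a four-gene pathway, the tail probability is 65/4845, which is the kind of evidence an enrichment view such as panel e would summarise.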

