Generating a focused view of disease ontology cancer terms for pan-cancer data integration and analysis.

Wu TJ, Schriml LM, Chen QR, Colbert M, Crichton DJ, Finney R, Hu Y, Kibbe WA, Kincaid H, Meerzaman D, Mitraka E, Pan Y, Smith KM, Srivastava S, Ward S, Yan C, Mazumder R - Database (Oxford) (2015)

Bottom Line: There are multiple initiatives that are developing biomedical terminologies for the purpose of providing better annotation, data integration and mining capabilities. The disease ontology (DO) was developed over the past decade to address data integration, standardization and annotation issues for human disease data. For example, the COSMIC term 'kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma' and TCGA term 'Kidney renal clear cell carcinoma' were both grouped to the term 'Disease Ontology Identification (DOID):4467 / renal clear cell carcinoma' which was mapped to the TopNodes_DOcancerslim term 'DOID:263 / kidney cancer'.


Affiliation: Department of Biochemistry and Molecular Medicine, George Washington University, Washington, DC 20037, USA, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Center for Bioinformatics and Information Technology, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20892-9760, USA, NASA Jet Propulsion Laboratory, Pasadena, CA, USA, Division of Cancer Prevention, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20892-9760, USA, Wellcome Trust Sanger Institute, Cambridge, UK and McCormick Genomic and Proteomic Center, George Washington University, Washington, DC 20037, USA.



bav032-F2: DO cancer Circos plot showing the hierarchical structure of the system. All mapped subsumed terms (the innermost layer), TopNodes_DOcancerslim level terms (the middle layer) and child terms (the outermost layer) are plotted with the full DOIDs/terms listed. The top-level terms and child terms are available in the Supplementary Table S1. The summarized terms are derived from the level under cell type cancer and organ system cancer of DOID / cancer in DO.

Mentions: The list of 386 cancer terms derived from all of the data sources mentioned earlier was mapped to terms within DO. The manual process involved investigating each cancer term to identify its current classification from authoritative resources, including primary publications, the NCI Dictionary of Cancer Terms, NCI cancer topics and the WHO classification. This process involved identifying the proper nomenclature for each term and developing a DO definition to describe the disease etiology. Each of these resources was then included as a reference for the term and definition provenance in DO. The four primary steps to map cancer terms from different data sites to DO terms were as follows. Step 1 involved identifying the name of the cancer term from the data source. For example, the COSMIC term 'central_nervous_system, basal_ganglia, glioma, astrocytoma_Grade_IV' was identified as grade IV astrocytoma. Step 2 involved identifying the current nomenclature for this disease, as disease terms change over time. In this example, grade IV astrocytoma mapped to glioblastoma multiforme. Step 3 involved determining whether the cancer term already existed in DO, could be mapped to a synonym of a DO term, or was a novel term that should be added to DO. In this example, glioblastoma multiforme already existed in DO as Disease Ontology Identification (DOID):3068. Step 4 involved investigating the most appropriate definition of the term and defining all parental terms linked to this primary term. Each term was mapped within DO to the most appropriate cell type or organ system cancer node. These steps resulted in 43 new terms being added to DO and the addition of definitions and references to 63 DO parent terms and 187 DO child node terms. This DO cancer hierarchical structure is presented in Figures 1 and 2 using tree and Circos plots (46).
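The four steps can be illustrated with a minimal Python sketch. It is not the authors' tooling: Steps 1 and 2 were manual curation, so a hand-filled lookup table stands in for them, and the names `CURATED_NAME`, `DO_NAME_TO_ID`, `DO_SYNONYM_TO_ID` and `map_source_term` are hypothetical. Only the glioblastoma multiforme entry (DOID:3068) comes from the text; the synonym entry is an assumption added to show the synonym branch of Step 3.

```python
from typing import Dict, Optional, Tuple

# Steps 1-2: manual curation in the article; a lookup table stands in here.
# Maps the raw source label to the current disease name.
CURATED_NAME: Dict[str, str] = {
    "central_nervous_system, basal_ganglia, glioma, astrocytoma_Grade_IV":
        "glioblastoma multiforme",
}

# Step 3: toy slice of DO. The name entry is quoted in the text.
DO_NAME_TO_ID: Dict[str, str] = {"glioblastoma multiforme": "DOID:3068"}
# Hypothetical synonym entry, for illustration only.
DO_SYNONYM_TO_ID: Dict[str, str] = {"gbm": "DOID:3068"}


def map_source_term(source_label: str) -> Tuple[Optional[str], str]:
    """Return (DOID or None, status) for one source label."""
    name = CURATED_NAME.get(source_label)
    if name is None:
        # No curated name yet: Steps 1-2 still have to be done by hand.
        return None, "needs manual curation"
    key = name.lower()
    if key in DO_NAME_TO_ID:
        return DO_NAME_TO_ID[key], "matched DO name"
    if key in DO_SYNONYM_TO_ID:
        return DO_SYNONYM_TO_ID[key], "matched DO synonym"
    # Neither a name nor a synonym: the term may need to be added to DO.
    return None, "candidate new DO term"


if __name__ == "__main__":
    label = "central_nervous_system, basal_ganglia, glioma, astrocytoma_Grade_IV"
    print(map_source_term(label))
```

Step 4 (writing definitions and linking parent terms) is curation work and is deliberately left out of the sketch.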
The generated hierarchy represents a cohesive set of DO terms that enables cancer terms to be mapped across cancer resources. Table 1 contains the terms listed in the TOPNodes_DOcancerslim.obo file along with their number of 'Children Nodes' and 'Source' databases.
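How such a hierarchy lets terms from different resources be compared can be sketched as a roll-up from a specific DO term to its slim-level ancestor. The sketch below is illustrative only: the COSMIC and TCGA labels and the DOID:4467 to DOID:263 grouping are taken from this article, but the single direct `is_a` edge is a simplification (the real DO may contain intermediate terms), and `IS_A`, `SLIM_TERMS`, `slim_terms_for` and `group_sources_by_slim` are hypothetical names.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple

# (resource, source label) -> DO term, as described in the article.
SOURCE_TO_DO: Dict[Tuple[str, str], str] = {
    ("COSMIC", "kidney, NS, carcinoma, clear_cell_renal_cell_carcinoma"): "DOID:4467",
    ("TCGA", "Kidney renal clear cell carcinoma"): "DOID:4467",
}

# Simplified is_a edges (child -> parents); assumption: one direct edge.
IS_A: Dict[str, List[str]] = {"DOID:4467": ["DOID:263"]}

# Slim-level terms; only the kidney cancer term from the text is listed.
SLIM_TERMS: Set[str] = {"DOID:263"}


def slim_terms_for(doid: str) -> Set[str]:
    """Walk is_a edges upward and collect the first slim terms reached."""
    seen, queue, hits = {doid}, deque([doid]), set()
    while queue:
        current = queue.popleft()
        if current in SLIM_TERMS:
            hits.add(current)
            continue  # stop climbing once a slim term is reached
        for parent in IS_A.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return hits


def group_sources_by_slim() -> Dict[str, List[str]]:
    """Group resource names under the slim term their DO term rolls up to."""
    grouped: Dict[str, Set[str]] = defaultdict(set)
    for (resource, _label), doid in SOURCE_TO_DO.items():
        for slim in slim_terms_for(doid):
            grouped[slim].add(resource)
    return {slim: sorted(resources) for slim, resources in grouped.items()}


if __name__ == "__main__":
    print(group_sources_by_slim())
```

A breadth-first walk is used so that a term with several parents can roll up to more than one slim term, which is possible when a disease sits under both a cell type node and an organ system node.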

