Semantic Web-based integration of cancer pathways and allele frequency data.

Holford ME, Rajeevan H, Zhao H, Kidd KK, Cheung KH - Cancer Inform (2009)

Bottom Line: The ability to perform queries across the domains of population genetics and pathways offers the potential to answer a number of cancer-related research questions. This sort of information could be useful for designing clinical studies and for providing background data in personalized medicine. It could also assist with the interpretation of genetic analysis results such as those from genome-wide association studies.


Affiliation: Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA.

ABSTRACT
We demonstrate the use of Semantic Web technology to integrate the ALFRED allele frequency database and the Starpath pathway resource. The linking of population-specific genotype data with cancer-related pathway data is potentially useful given the growing interest in personalized medicine and the exploitation of pathway knowledge for cancer drug discovery. We model our data using the Web Ontology Language (OWL), drawing upon ideas from two existing standard formats: BioPAX for pathway data and PML for allele frequency data. We store our data within an Oracle database, using Oracle Semantic Technologies. We then query the data using Oracle's rule-based inference engine and SPARQL-like RDF query language. The ability to perform queries across the domains of population genetics and pathways offers the potential to answer a number of cancer-related research questions. Among the possibilities is the ability to identify genetic variants which are associated with cancer pathways and whose frequency varies significantly between ethnic groups. This sort of information could be useful for designing clinical studies and for providing background data in personalized medicine. It could also assist with the interpretation of genetic analysis results such as those from genome-wide association studies.


Figure 1 (f1-cin-08-19): The ALFRED database schema.

Mentions: ALFRED, the allele frequency database, provides allele frequency data for anthropologically defined human population samples.16 It contains both public data from the literature and unpublished data from our host research laboratory and its collaborators. For data derived from the literature, we tried to select those polymorphisms which have been studied in a wide variety of populations. ALFRED covers a broader spectrum of anthropologically defined populations than HapMap,17 another frequently cited source of allele frequency data. Over 95% of the polymorphisms in ALFRED have frequency data from more than 10 different populations, not counting samples from different regions within the same population. We implemented ALFRED using a traditional relational structure, which is illustrated in Figure 1. An individual polymorphism (or Site) is contained within a locus on the genome. Ethnic populations are organized by their geographic location (Geographic_Region). Multiple samples may be drawn from a particular population. For such highly heterogeneous populations as African American or European American, special care is taken to delineate the specific geographic region of the population. Population samples are typed to determine the frequency of alleles at a site. The Typed_Sample table bridges samples and polymorphisms and also associates the typing method, which is detailed in the Typing_Method table. The allele frequency values for a Typed_Sample are stored in the Frequencies table. Information about the contributor of particular allele frequency data is kept in the Contributors table.
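
The relational structure described above can be sketched as a small in-memory SQLite database. This is only an illustration, not the actual ALFRED schema: the table names Geographic_Region, Site, Typed_Sample, Typing_Method, Frequencies and Contributors come from the text, while the Population, Sample and Locus table names, every column name, the placement of the contributor link on Typed_Sample, and all rows of data are assumptions made up for the example.

```python
import sqlite3

# Table names partly follow the text; ALL column names are hypothetical.
SCHEMA = """
CREATE TABLE Geographic_Region (region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Population (population_id INTEGER PRIMARY KEY, name TEXT,
    region_id INTEGER REFERENCES Geographic_Region(region_id));
CREATE TABLE Sample (sample_id INTEGER PRIMARY KEY,
    population_id INTEGER REFERENCES Population(population_id));
CREATE TABLE Locus (locus_id INTEGER PRIMARY KEY, symbol TEXT);
CREATE TABLE Site (site_id INTEGER PRIMARY KEY, name TEXT,
    locus_id INTEGER REFERENCES Locus(locus_id));
CREATE TABLE Typing_Method (method_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE Contributors (contributor_id INTEGER PRIMARY KEY, name TEXT);
-- Typed_Sample bridges samples and sites and records the typing method.
-- Linking the contributor here is an assumption, not stated in the text.
CREATE TABLE Typed_Sample (typed_sample_id INTEGER PRIMARY KEY,
    sample_id INTEGER REFERENCES Sample(sample_id),
    site_id INTEGER REFERENCES Site(site_id),
    method_id INTEGER REFERENCES Typing_Method(method_id),
    contributor_id INTEGER REFERENCES Contributors(contributor_id));
CREATE TABLE Frequencies (
    typed_sample_id INTEGER REFERENCES Typed_Sample(typed_sample_id),
    allele TEXT, frequency REAL);
"""


def build_demo_db():
    """Create the sketch schema and load placeholder (invented) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Geographic_Region VALUES (?, ?)",
                     [(1, "Region A"), (2, "Region B")])
    conn.executemany("INSERT INTO Population VALUES (?, ?, ?)",
                     [(1, "Population 1", 1), (2, "Population 2", 2)])
    conn.executemany("INSERT INTO Sample VALUES (?, ?)", [(1, 1), (2, 2)])
    conn.execute("INSERT INTO Locus VALUES (1, 'LOCUS1')")
    conn.execute("INSERT INTO Site VALUES (1, 'site_1', 1)")
    conn.execute("INSERT INTO Typing_Method VALUES (1, 'placeholder method')")
    conn.execute("INSERT INTO Contributors VALUES (1, 'placeholder contributor')")
    conn.executemany("INSERT INTO Typed_Sample VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1), (2, 2, 1, 1, 1)])
    conn.executemany("INSERT INTO Frequencies VALUES (?, ?, ?)",
                     [(1, "A", 0.25), (1, "G", 0.75),
                      (2, "A", 0.6), (2, "G", 0.4)])
    conn.commit()
    return conn


def frequencies_by_population(conn, site_name):
    """Return (population, allele, frequency) rows for one site."""
    query = """
        SELECT p.name, f.allele, f.frequency
        FROM Frequencies f
        JOIN Typed_Sample ts ON ts.typed_sample_id = f.typed_sample_id
        JOIN Sample s ON s.sample_id = ts.sample_id
        JOIN Population p ON p.population_id = s.population_id
        JOIN Site st ON st.site_id = ts.site_id
        WHERE st.name = ?
        ORDER BY p.name, f.allele
    """
    return conn.execute(query, (site_name,)).fetchall()


if __name__ == "__main__":
    for row in frequencies_by_population(build_demo_db(), "site_1"):
        print(row)
```

The join path in frequencies_by_population follows the description: a frequency belongs to a typed sample, which ties a population sample to a site, so comparing allele frequencies across populations for one polymorphism is a single query over the bridge table.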

