Improved human disease candidate gene prioritization using mouse phenotype.

Chen J, Xu H, Aronow BJ, Jegga AG - BMC Bioinformatics (2007)

Affiliation: Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, USA. Jing.Chen@cchmc.org

ABSTRACT

Background: The majority of common diseases are multifactorial and are modified by genetically and mechanistically complex polygenic interactions and environmental factors. High-throughput genome-wide studies such as linkage analysis and gene expression profiling tend to be most useful for classification and characterization, but they do not provide sufficient information to identify or prioritize specific disease-causal genes.

Results: Extending an earlier hypothesis that the majority of genes that impact or cause disease share membership in any of several functional relationships, we show, for the first time, the utility of mouse phenotype data in human disease gene prioritization. We study the effect of different data integration methods and, based on validation studies, show that our approach, ToppGene (http://toppgene.cchmc.org), outperforms two existing candidate gene prioritization methods, SUSPECTS and ENDEAVOUR.

Conclusion: Incorporating phenotype information for mouse orthologs of human genes greatly improves human disease candidate gene analysis and prioritization.

Figure 5: Schematic representation of gene prioritization. (A) Genes in the training set are selected based on their attributes or current gene annotations (genes associated with a disease, phenotype, pathway or a GO term). (B) The test genes can be candidate genes from linkage analysis studies or genes differentially expressed in a particular disease or phenotype. (C) Enriched terms of the eight gene annotations, namely, GO: Molecular Function, GO: Biological Process, Mouse Phenotype, Pathways, Protein Interactions, Protein Domains and Gene Expression, compiled from various data sources, are obtained for the training set of genes. (D) A similarity score is generated for each annotation of each test gene by comparing its terms to the enriched terms in the training set of genes. The final prioritized gene list is then computed from the aggregated values of the eight similarity scores.

Mentions: We used seven data sources (6 human-related and 1 mouse-related) to prioritize the gene candidates (see Figure 5).
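
The workflow in Figure 5 can be illustrated with a small Python sketch. This is not the ToppGene implementation: the source does not specify the enrichment test, the similarity measure or the aggregation rule, so the sketch assumes a hypergeometric test for enrichment, the fraction of enriched terms carried by a test gene as the similarity score, and an unweighted mean across annotation types as the aggregate. All function names and the toy gene and term labels are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k): chance that at least k of n sampled genes carry a term
    that K of the N background genes carry."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enriched_terms(training, annotations, alpha=0.05):
    """Step C: terms over-represented in the training genes for one
    annotation type. `annotations` maps gene -> set of terms; its keys
    serve as the background (an assumption of this sketch)."""
    N = len(annotations)
    train = [g for g in training if g in annotations]
    n = len(train)
    background_counts, training_counts = {}, {}
    for terms in annotations.values():
        for t in terms:
            background_counts[t] = background_counts.get(t, 0) + 1
    for g in train:
        for t in annotations[g]:
            training_counts[t] = training_counts.get(t, 0) + 1
    return {
        t for t, k in training_counts.items()
        if hypergeom_pvalue(k, background_counts[t], n, N) <= alpha
    }


def similarity(gene_terms, enriched):
    """Step D: fraction of the enriched terms that the test gene carries
    (an assumed measure, chosen for simplicity)."""
    if not enriched:
        return 0.0
    return len(gene_terms & enriched) / len(enriched)


def prioritize(training, test_genes, sources, alpha=0.05):
    """Rank test genes by the mean of their per-annotation similarity scores.
    `sources` maps annotation type -> {gene: set of terms}."""
    enriched = {
        name: enriched_terms(training, ann, alpha)
        for name, ann in sources.items()
    }
    ranked = []
    for gene in test_genes:
        scores = [
            similarity(ann.get(gene, set()), enriched[name])
            for name, ann in sources.items()
        ]
        ranked.append((gene, sum(scores) / len(scores)))
    # Highest aggregate score first; ties broken by gene name.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Toy data: 12 background genes, two annotation types (hypothetical labels).
genes = [f"g{i}" for i in range(1, 13)]
toy_sources = {
    "mouse_phenotype": {
        g: ({"heart", "common"} if g in genes[:4] else {"common"})
        for g in genes
    },
    "pathway": {g: ({"wnt"} if g in genes[:5] else {"mapk"}) for g in genes},
}
toy_training = ["g1", "g2", "g3"]
toy_test = ["g4", "g5", "g6"]

if __name__ == "__main__":
    for gene, score in prioritize(toy_training, toy_test, toy_sources):
        print(gene, score)
```

In the toy data, "heart" and "wnt" are enriched in the training genes, whereas "common" (carried by every gene) is not. Gene g4 matches both enriched terms, g5 matches only the pathway term, and g6 matches neither, so the ranking is g4, g5, g6. A real application would use one annotation map per data source and would likely need multiple-testing correction and a more refined similarity measure.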

