Prediction of human disease genes by human-mouse conserved coexpression analysis.

Ala U, Piro RM, Grassi E, Damasco C, Silengo L, Oti M, Provero P, Di Cunto F - PLoS Comput. Biol. (2008)

Bottom Line: However, so far, gene coexpression has not been used very successfully to prioritize positional candidates. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.


Affiliation: Molecular Biotechnology Center, Department of Genetics, Biology and Biochemistry, University of Turin, Turin, Italy.

ABSTRACT

Background: Even in the post-genomic era, the identification of candidate genes within loci associated with human genetic diseases is a very demanding task, because the critical region may typically contain hundreds of positional candidates. Since genes implicated in similar phenotypes tend to share very similar expression profiles, high throughput gene expression data may represent a very important resource to identify the best candidates for sequencing. However, so far, gene coexpression has not been used very successfully to prioritize positional candidates.

Methodology/principal findings: We show that it is possible to reliably identify disease-relevant relationships among genes from massive microarray datasets by concentrating only on genes sharing similar expression profiles in both human and mouse. Moreover, we show systematically that the integration of human-mouse conserved coexpression with a phenotype similarity map allows the efficient identification of disease genes in large genomic regions. Finally, using this approach on 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.

Conclusion: Our results demonstrate that conserved coexpression, even at the human-mouse phylogenetic distance, represents a very strong criterion to predict disease-relevant relationships among human genes.


Figure 2 (pcbi-1000043-g002): Comparison of the Affy and Stanford networks with functional, physical interaction, and disease-related information. (A) Prevalence of functionally related coexpression clusters (see Materials and Methods). (B) Number of edges of the CCN joining proteins previously shown to physically interact by different techniques, as deduced from the HPRD database. (C) Number of edges of the indicated networks connecting genes involved in Mendelian phenotypes sharing a MimMiner score of 0.4 or higher. In each case, the results for the actual CCNs are compared to the results averaged over 100 randomized CCNs, with error bars representing the standard deviation of the latter.
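
The comparison described for panels B and C (edges supported by an external reference set, set against 100 randomized networks) can be sketched as follows. This is only an illustration under stated assumptions: the caption does not say how the CCNs were randomized, so the sketch assumes a simple shuffle of gene labels that keeps the network topology fixed. The function names and the `related_pairs` structure (a set of `frozenset` gene pairs, standing in for HPRD interactions or MimMiner-related disease gene pairs) are hypothetical.

```python
import random
from statistics import mean, stdev


def count_supported_edges(edges, related_pairs):
    """Count network edges whose gene pair also appears in the reference set."""
    return sum(frozenset(edge) in related_pairs for edge in edges)


def shuffle_labels(edges, rng):
    """Randomize a network by permuting gene labels (ASSUMED null model).

    The topology (degree of each node position) is preserved; only the
    assignment of gene names to nodes changes.
    """
    genes = sorted({gene for edge in edges for gene in edge})
    shuffled = genes[:]
    rng.shuffle(shuffled)
    relabel = dict(zip(genes, shuffled))
    return [(relabel[a], relabel[b]) for a, b in edges]


def compare_with_randomized(edges, related_pairs, n_random=100, seed=0):
    """Return (observed count, mean over randomized networks, standard deviation)."""
    rng = random.Random(seed)
    observed = count_supported_edges(edges, related_pairs)
    null_counts = [
        count_supported_edges(shuffle_labels(edges, rng), related_pairs)
        for _ in range(n_random)
    ]
    return observed, mean(null_counts), stdev(null_counts)
```

The mean and standard deviation returned here correspond to the bar height and error bar that the caption describes for the randomized CCNs; a different null model (for example, rewiring edges) would need a different `shuffle_labels`.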


Mentions: The analysis of the CCNs was based on the construction of coexpression clusters, each defined as a given gene (the center of the cluster) plus its nearest neighbors in the conserved coexpression network, thus obtaining one cluster for each gene. The prevalence of edges joining functionally related genes in the CCNs was tested by analyzing the prevalence of Gene Ontology terms within coexpression clusters, compared to the same prevalence in randomized coexpression clusters. We counted the number of coexpression clusters for which at least one Gene Ontology term was significantly overrepresented (P-value less than 10^-4 with Fisher's exact test), and compared this number with the same number averaged over 100 randomized CCNs. This was done separately for the Affy and Stanford networks, and the results are shown in Figure 2A.
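
The cluster-based test described above can be sketched in a few lines. This is a minimal illustration, not the authors' code, and it rests on several assumptions: "nearest neighbors" is read as the genes directly connected to the center, the Fisher test is taken as one-sided (over-representation only), and every gene in the network is assumed to be present in `universe`. The names `edges`, `annotations` (gene to Gene Ontology terms), and `universe` are hypothetical inputs.

```python
from math import comb


def coexpression_clusters(edges):
    """Build one cluster per gene: the gene itself plus its direct neighbors."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {gene: {gene} | nbrs for gene, nbrs in neighbors.items()}


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher exact P-value: P(X >= k), X ~ Hypergeometric(N, K, n).

    N: genes in the universe, K: genes annotated with the term,
    n: cluster size, k: annotated genes inside the cluster.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def enriched_cluster_count(edges, annotations, universe, alpha=1e-4):
    """Count clusters with at least one term over-represented at P < alpha."""
    N = len(universe)
    term_sizes = {}
    for gene in universe:
        for term in annotations.get(gene, ()):
            term_sizes[term] = term_sizes.get(term, 0) + 1

    count = 0
    for cluster in coexpression_clusters(edges).values():
        hits = {}
        for gene in cluster:
            for term in annotations.get(gene, ()):
                hits[term] = hits.get(term, 0) + 1
        n = len(cluster)
        if any(fisher_right_tail(k, n, term_sizes[t], N) < alpha
               for t, k in hits.items()):
            count += 1
    return count
```

Running `enriched_cluster_count` once on the actual network and once on each randomized network gives the two quantities compared in Figure 2A. The sketch applies no multiple-testing correction beyond the fixed threshold, since the text mentions none.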

