GroupRank: rank candidate genes in PPI network by differentially expressed gene groups.

Wang Q, Zhang S, Pang S, Zhang M, Wang B, Liu Q, Li J - PLoS ONE (2014)

Bottom Line: Many cell activities are organized as a network, and genes are clustered into co-expressed groups if they have the same or closely related biological function or they are co-regulated. A candidate gene is ranked high using GroupRank if it is differentially expressed in disease and control or is close to differentially co-expressed groups in PPI network. Moreover, the functional analyses of the major contributing gene group in gene prioritization of kidney cancer verified that our algorithm GroupRank not only ranks disease genes efficiently but also could help us identify and understand possible mechanisms in important physiological and pathological processes of disease.


Affiliation: Department of Bioinformatics & Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.

ABSTRACT
Many cell activities are organized as networks, and genes cluster into co-expressed groups when they share the same or closely related biological functions or are co-regulated. In this study, we assumed that a strong candidate disease gene is more likely to lie close to gene groups whose members are all coordinately differentially expressed than to individual differentially expressed genes. Based on this assumption, we developed a novel disease gene prioritization method, GroupRank, which integrates gene co-expression and differential expression information from microarray data with a protein-protein interaction (PPI) network. GroupRank ranks a candidate gene highly if it is differentially expressed between disease and control samples or is close to differentially co-expressed groups in the PPI network. We tested the method on data sets for lung cancer, kidney cancer, leukemia, and breast cancer. GroupRank prioritized disease genes efficiently, with a significantly improved AUC value compared with the previous method, which does not consider co-expressed gene groups in the PPI network. Moreover, functional analysis of the major contributing gene groups in the kidney cancer prioritization showed that GroupRank not only ranks disease genes efficiently but can also help identify and understand possible mechanisms in important physiological and pathological processes of disease.
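
The abstract evaluates prioritization by AUC. As a minimal sketch of what that number measures, the Python snippet below computes the AUC of a ranking as the probability that a known disease gene scores higher than a gene not known to be disease-related, with ties counted as half. The function name `ranking_auc`, the gene labels, and the toy scores are illustrative assumptions; they are not taken from the paper and say nothing about how GroupRank itself computes its scores.

```python
def ranking_auc(scores, known):
    """AUC of a gene ranking.

    scores: dict mapping gene name -> ranking score (higher = ranked higher).
    known:  set of known disease genes (the positives).
    Returns the fraction of (known, other) gene pairs in which the known
    gene has the higher score; ties count as half a win.
    """
    pos = [s for g, s in scores.items() if g in known]
    neg = [s for g, s in scores.items() if g not in known]
    if not pos or not neg:
        raise ValueError("need at least one known and one other gene")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): five genes, two of them known disease genes.
toy_scores = {"G1": 0.9, "G2": 0.8, "G3": 0.4, "G4": 0.3, "G5": 0.1}
toy_known = {"G1", "G3"}
auc = ranking_auc(toy_scores, toy_known)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a random ordering and 1.0 to a ranking that places every known disease gene above every other gene, which is why a higher AUC is read as better prioritization.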


Figure 5 (pone-0110406-g005): Schematic graph of gene ranking of kidney cancer using GroupRank. The triangle nodes at the top represent known kidney cancer genes, and the square nodes at the bottom represent the top 20 ranked kidney cancer genes from GroupRank. The circle nodes in the middle represent the co-expressed gene groups used to rank disease gene candidates. A known or putative cancer gene is connected to a gene group if that group contributes more than 5% of the gene's summed ranking score. The width of an edge linked to a disease gene is proportional to the scoring contribution from the corresponding gene group. Edges that explain more than 20% of the ranking score of the cancer gene candidate are highlighted in dark blue; an edge is colored light blue if the gene group's scoring contribution is from 15% to 20%. A darker node color indicates a higher fold change in expression level between cancer and normal control. The size of a circle node is proportional to the gene group's accumulated contribution to the ranking scores of all known kidney cancer genes. The enriched functional annotation is labeled on each of the four major contributing gene groups.
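
The edge rules in the caption can be sketched in a few lines of Python. The snippet assumes that, for one gene, the contribution of each gene group to its ranking score is already available as a number. The function name `classify_edges`, the group labels, and the toy values are hypothetical. The caption does not say how edges between 5% and 15% are colored, so the label "unhighlighted" is an assumption.

```python
def classify_edges(contributions):
    """Apply the caption's edge rules to one gene.

    contributions: dict mapping gene-group name -> that group's contribution
                   to the gene's summed ranking score (non-negative numbers).
    Returns a dict group -> (share, style) for groups that get an edge,
    i.e. groups contributing more than 5% of the summed score.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("summed ranking score must be positive")
    edges = {}
    for group, value in contributions.items():
        share = value / total
        if share <= 0.05:
            continue  # 5% or less: no edge is drawn
        if share > 0.20:
            style = "dark blue"      # more than 20%
        elif share >= 0.15:
            style = "light blue"     # from 15% to 20%
        else:
            style = "unhighlighted"  # assumption: caption gives no color here
        edges[group] = (share, style)
    return edges


# Toy contributions (hypothetical) for a single gene; they sum to 100.
toy = {"groupA": 50.0, "groupB": 18.0, "groupC": 10.0, "groupD": 4.0}
toy["groupE"] = 18.0
edges = classify_edges(toy)
for group, (share, style) in sorted(edges.items()):
    print(f"{group}: {share:.0%} -> {style}")
```

Because the share is also what the caption maps to edge width, the same value could be passed directly to a plotting library as a line-width scale.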

Mentions: In the GroupRank algorithm, the co-expressed gene groups whose members change most significantly between cancer and normal control samples are presumed to play major roles in cancer. Conversely, studying these major contributing groups can help explain why a candidate gene ranks near the top and which pathway or biological process the candidate affects in the disease condition. Taking kidney cancer as an example, we investigated the gene groups, especially the major contributing groups, in the ranking of the top 20 gene candidates and the 21 known kidney cancer genes. As illustrated in Figure 5, four gene groups emerged from the accumulated contributions to the GroupRank ranking scores of known tumor genes, together explaining 64.7% of the ranking scores of all 21 known kidney cancer genes. The top 20 ranked genes were also strongly connected to these four groups, which indicates that the four groups are closely related to kidney cancer. We performed GO enrichment analysis of these groups using WebGestalt [24] and found that the groups, which were differentially expressed in kidney cancer, are involved in cell proliferation, protein binding, misfolded protein binding, and heat shock protein binding, respectively (p-value < 0.05, Bonferroni multiple testing adjustment). Short et al. (1993) reported that enhanced cell proliferation occurs at several stages of renal tumorigenesis [25]. Heat shock proteins (Hsps) are overexpressed in a wide range of human cancers and are implicated in tumor cell proliferation, differentiation, invasion, metastasis, death, and recognition by the immune system [26]. Misfolded proteins have also been reported in cancer studies, and targeted degradation of misfolded proteins has become one of the promising new therapeutic approaches in cancer treatment [27].
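
The enrichment results above are reported at p-value < 0.05 after Bonferroni adjustment. As a reminder of what that adjustment does, the sketch below multiplies each raw p-value by the number of tests and caps the result at 1. It illustrates the standard Bonferroni rule only; the function name and the toy p-values are assumptions, and it is not a description of how WebGestalt is implemented.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1.0, where m is the
    number of tests (here, the number of p-values supplied)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy raw p-values (hypothetical) for four tested GO terms.
toy_p = [0.001, 0.01, 0.02, 0.5]
adjusted = bonferroni_adjust(toy_p)
significant = [p < 0.05 for p in adjusted]
for raw, adj, sig in zip(toy_p, adjusted, significant):
    print(f"raw={raw:<6} adjusted={adj:.3f} significant={sig}")
```

With four tests, a raw p-value of 0.02 no longer passes the 0.05 threshold after adjustment, which shows how the correction guards against false positives when many terms are tested at once.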

