GAGE: generally applicable gene set enrichment for pathway analysis.

Luo W, Friedman MS, Shedden K, Hankenson KD, Woolf PJ - BMC Bioinformatics (2009)

Bottom Line: GSA focuses on sets of related genes and has established major advantages over individual gene analyses, including greater robustness, sensitivity and biological relevance. We successfully apply GAGE to multiple microarray datasets with different sample sizes, experimental designs and profiling techniques. GAGE consistently outperformed the two most frequently used GSA methods and inferred statistically and biologically more relevant regulatory pathways.


Affiliation: Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA. luo@cshl.edu

ABSTRACT

Background: Gene set analysis (GSA) is a widely used strategy for gene expression data analysis based on pathway knowledge. GSA focuses on sets of related genes and has established major advantages over individual gene analyses, including greater robustness, sensitivity and biological relevance. However, previous GSA methods have limited usage as they cannot handle datasets of different sample sizes or experimental designs.

Results: To address these limitations, we present a new GSA method called Generally Applicable Gene-set Enrichment (GAGE). We successfully apply GAGE to multiple microarray datasets with different sample sizes, experimental designs and profiling techniques. GAGE shows significantly better results than two other commonly used GSA methods, GSEA and PAGE. We demonstrate this improvement in three aspects: (1) consistency across repeated studies/experiments; (2) sensitivity and specificity; (3) biological relevance of the regulatory mechanisms inferred. GAGE reveals novel and relevant regulatory mechanisms from both published and previously unpublished microarray studies. From two published lung cancer data sets, GAGE derived a more cohesive and predictive mechanistic scheme underlying lung cancer progression and metastasis. For a previously unpublished BMP6 study, GAGE predicted novel regulatory mechanisms for BMP6-induced osteoblast differentiation, including the canonical BMP-TGF beta signaling, JAK-STAT signaling, Wnt signaling, and estrogen signaling pathways, all of which are supported by the experimental literature.

Conclusion: GAGE is generally applicable to gene expression datasets with different sample sizes and experimental designs. GAGE consistently outperformed the two most frequently used GSA methods and inferred statistically and biologically more relevant regulatory pathways. The GAGE method is implemented in R in the "gage" package, available under the GNU GPL from http://sysbio.engin.umich.edu/~luow/downloads.php.



Figure 3: GAGE captured canonical pathways that are significantly perturbed in both directions following 8 h BMP6 treatment in human MSC. (a) Gene expression level changes in the top 3 different significant canonical pathways inferred by GAGE and PAGE. (b) Gene expression level changes in the canonical TGF beta signaling pathway and (c) the same changes plotted in pseudo-color on the pathway topology derived from the KEGG database. The solid horizontal line and dashed lines in (a-b) mark the mean fold change of all genes and two standard deviations above and below the mean, respectively. Note that in (c), one KEGG node may correspond to multiple closely related genes with the same function; the maximum fold change among these genes is plotted as the color of the node.
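The reference lines and node coloring described in the caption can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the "gage" package: the function names `reference_lines` and `node_fold_change` are hypothetical, the use of the sample standard deviation is an assumption, and reading "maximum fold change" as the largest change by absolute magnitude (keeping its sign) is also an assumption, since the caption does not say.

```python
import statistics


def reference_lines(fold_changes):
    """Return (mean, lower, upper) for the solid and dashed lines.

    lower/upper are the mean minus/plus two standard deviations.
    Assumption: sample standard deviation over all genes.
    """
    mean = statistics.fmean(fold_changes)
    sd = statistics.stdev(fold_changes)
    return mean, mean - 2 * sd, mean + 2 * sd


def node_fold_change(gene_fold_changes):
    """Pick one fold change to color a pathway node shared by several genes.

    Assumption: "maximum" means largest absolute magnitude, sign retained.
    """
    return max(gene_fold_changes.values(), key=abs)


if __name__ == "__main__":
    # Made-up log fold changes, for illustration only.
    all_genes = [-1.2, -0.3, 0.0, 0.1, 0.4, 1.5]
    print(reference_lines(all_genes))
    print(node_fold_change({"SMAD6": -1.8, "SMAD7": 0.4}))
```

Under the absolute-magnitude reading, a node whose genes move in opposite directions is colored by its most strongly changed gene, whichever direction that is.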

Mentions: In contrast to PAGE and GSEA, GAGE separates canonical pathways from experimental sets and considers potential perturbations in both directions (i.e. up and down regulation simultaneously) in canonical pathways. Expression data directly showed that genes in the most relevant canonical pathways are regulated in both directions (Figure 3). Figure 3a shows the gene expression level changes following BMP6 treatment in the top 3 different significant canonical pathways inferred by GAGE and PAGE (Table 3 and Additional file 1: Supplementary Table 8). The canonical pathways inferred by GAGE are directly related to BMP-induced osteoblast differentiation [34,35] (the Alk pathway is essentially TGF beta signaling + Wnt signaling). Figure 3b shows the gene expression level changes in the TGF beta-BMP signaling pathway following BMP6 treatment. This pathway is a presumed gold standard, as it is the primary signal triggered directly by BMPs (KEGG). The changes in gene expression are not uniform. The TGF-beta pathway includes both positive effectors such as BMPs, BMPR1–2, SMAD1/5/8, ID1–4, and THBS, and negative effectors such as NOG, SMAD2/3, and SMAD6/7. Clearly, both types of effectors were regulated up and down. Genes are regulated in both directions not only across the whole pathway but also within sub-pathways such as the BMP or TGF-beta signaling branches. These results demonstrate that genes in canonical pathways are frequently up- and down-regulated simultaneously because 1) they play positive or negative roles [20] and 2) homeostatic mechanisms tend to bring a certain level of balance back to the system when it is perturbed [19]. Therefore, it is necessary to treat canonical pathways differently from experimental sets and to count both up and down regulation when doing gene set analyses.
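To make the argument concrete, the toy Python sketch below shows how a set whose genes move in opposite directions can look unperturbed to a one-directional set-mean score yet stand out once both directions are counted. It is a hedged illustration, not the GAGE procedure or the "gage" package API: the parametric z-score, the use of absolute fold changes for the two-directional case, the function names, and all data values are assumptions made for this example.

```python
import math
import statistics


def set_z_score(values, set_idx):
    """Z-score of the gene-set mean against the mean of all genes."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    set_mean = statistics.fmean([values[i] for i in set_idx])
    return (set_mean - mu) * math.sqrt(len(set_idx)) / sigma


def one_direction_score(fold_changes, set_idx):
    """Score on signed fold changes: up and down regulation cancel out."""
    return set_z_score(fold_changes, set_idx)


def both_directions_score(fold_changes, set_idx):
    """Score on absolute fold changes: up and down regulation both count."""
    return set_z_score([abs(x) for x in fold_changes], set_idx)


# Toy data (invented): 100 background genes with small changes, followed by
# a 10-gene pathway with 5 genes strongly up and 5 genes strongly down.
background = [0.1 * ((i % 7) - 3) for i in range(100)]
pathway = [2.0] * 5 + [-2.0] * 5
fold_changes = background + pathway
pathway_idx = list(range(100, 110))

if __name__ == "__main__":
    print("one direction:  ", one_direction_score(fold_changes, pathway_idx))
    print("both directions:", both_directions_score(fold_changes, pathway_idx))
```

In this toy set the signed mean of the pathway is zero, so the one-directional score is close to zero, while the score on absolute changes is large. That is the situation the paragraph describes for canonical pathways containing both positive and negative effectors.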

