Co-expression module analysis reveals biological processes, genomic gain, and regulatory mechanisms associated with breast cancer progression.

Shi Z, Derow CK, Zhang B - BMC Syst Biol (2010)

Bottom Line: Sixteen out of the 17 modules showed significant enrichment in certain Gene Ontology (GO) categories. IRF family and ETS family proteins were responsible for the up-regulation of the immune response modules. Moreover, inhibition of the PPARA signaling pathway may also play an important role in tumor progression.


Affiliation: Advanced Computing Center for Research & Education, Vanderbilt University, Nashville, TN 37240, USA.

ABSTRACT

Background: Gene expression signatures are typically identified by correlating gene expression patterns to a disease phenotype of interest. However, individual gene-based signatures usually suffer from low reproducibility and interpretability.

Results: We have developed a novel algorithm, Iterative Clique Enumeration (ICE), for identifying relatively independent maximal cliques as co-expression modules, together with a module-based approach to the analysis of gene expression data. Applying this approach to a public breast cancer dataset identified 19 modules whose expression levels were significantly correlated with tumor grade. The correlations were reproducible for 17 modules in an independent breast cancer dataset, and the reproducibility was considerably higher than that based on individual genes or on modules identified by other algorithms. Sixteen out of the 17 modules showed significant enrichment in certain Gene Ontology (GO) categories. Specifically, modules related to cell proliferation and immune response were up-regulated in high-grade tumors, while those related to cell adhesion were down-regulated. Further analyses showed that transcription factors NFYB, E2F1/E2F3, NRF1, and ELK1 were responsible for the up-regulation of the cell proliferation modules. IRF family and ETS family proteins were responsible for the up-regulation of the immune response modules. Moreover, inhibition of the PPARA signaling pathway may also play an important role in tumor progression. The module without GO enrichment was found to be associated with a potential genomic gain in 8q21-23 in high-grade tumors. The 17-module signature of breast tumor progression clustered patients into subgroups with significantly different relapse-free survival times. Namely, patients with lower cell proliferation and higher cell adhesion levels had significantly lower risk of recurrence, both for all patients (p = 0.004) and for those with grade 2 tumors (p = 0.017).

Conclusions: The ICE algorithm is effective in identifying relatively independent co-expression modules from gene co-expression networks, and the module-based approach illustrated in this study provides a robust, interpretable, and mechanistic characterization of transcriptional changes.



Figure 1: Schematic overview of the co-expression module-based analysis framework. GO: Gene Ontology. TFBS: Transcription Factor Binding Sites.

Figure 1 depicts an overview of the co-expression module-based analysis framework. Starting from a gene expression data set, a co-expression network is constructed in which each node is a gene and two genes are connected by an edge if their expression similarity is above a pre-selected threshold. Although we used Pearson's correlation coefficient for the similarity calculation in this study, other measures such as Spearman's correlation coefficient and mutual information are equally applicable. A knowledge-guided method is employed for threshold selection to ensure the biological relevance of the gene co-expression network. Next, the ICE algorithm developed in this study is used to identify relatively independent maximal cliques as co-expression modules. In contrast to single gene-based analyses, in which individual genes are tested for their correlation to a phenotype of interest (e.g. tumor grade or stage), the module-based approach analyzes modules as units and identifies co-expression modules that are significantly correlated with the phenotype, i.e. potential module biomarkers. Finally, the identified modules are queried against gene set databases such as the GO gene sets and Transcription Factor Binding Site (TFBS) gene sets to infer the biological processes and regulatory mechanisms underlying the phenotype of interest.
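The first two steps of this framework (thresholding pairwise correlations into a network, then enumerating maximal cliques) can be illustrated with a minimal sketch. It is not the ICE algorithm: ICE additionally selects relatively independent cliques, and that selection step is not described in this excerpt. The sketch uses plain Bron-Kerbosch enumeration instead. The function names, the toy expression values, and the 0.9 threshold are all assumptions made for illustration; the study selects its threshold with a knowledge-guided method.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sx * sy)


def build_network(expr, threshold):
    """Connect two genes when their correlation exceeds the threshold."""
    adj = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if pearson(expr[g1], expr[g2]) > threshold:
            adj[g1].add(g2)
            adj[g2].add(g1)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    return cliques


# Hypothetical toy data: five genes measured in five samples.
expr = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10.5],
    "C": [1.1, 2.0, 2.9, 4.2, 5.0],
    "D": [5, 3, 4, 1, 2],
    "E": [5.2, 2.9, 4.1, 1.0, 2.2],
}

# 0.9 is an arbitrary illustrative threshold, not the one used in the study.
modules = maximal_cliques(build_network(expr, threshold=0.9))
print(sorted(sorted(m) for m in modules))
```

On real data the number of maximal cliques can be very large and they overlap heavily, which is the situation the selection of relatively independent cliques is meant to address.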

