Network modeling of the transcriptional effects of copy number aberrations in glioblastoma.

Jörnsten R, Abenius T, Kling T, Schmidt L, Johansson E, Nordling TE, Nordlander B, Sander C, Gennemark P, Funa K, Nilsson B, Lindahl L, Nelander S - Mol. Syst. Biol. (2011)

Bottom Line: Prognostic scores are obtained from a singular value decomposition of the networks. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. Free software in MATLAB and R is provided.


Affiliation: Mathematical Sciences, University of Gothenburg and Chalmers University of Technology, Gothenburg, Sweden.

ABSTRACT
DNA copy number aberrations (CNAs) are a hallmark of cancer genomes. However, little is known about how such changes affect global gene expression. We develop a modeling framework, EPoC (Endogenous Perturbation analysis of Cancer), to (1) detect disease-driving CNAs and their effect on target mRNA expression, and to (2) stratify cancer patients into long- and short-term survivors. Our method constructs causal network models of gene expression by combining genome-wide DNA- and RNA-level data. Prognostic scores are obtained from a singular value decomposition of the networks. By applying EPoC to glioblastoma data from The Cancer Genome Atlas consortium, we demonstrate that the resulting network models contain known disease-relevant hub genes, reveal interesting candidate hubs, and uncover predictors of patient survival. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. We conclude that large-scale network modeling of the effects of CNAs on gene expression may provide insights into the biology of human cancer. Free software in MATLAB and R is provided.


Figure 6: Method comparisons: network consistency and pathway interactions. (A) We compare network models derived from two full replicate glioblastoma data sets (146 identical tumors; same patients and samples) that were processed at different centers with slightly different technological setups (Affymetrix and Agilent technologies, run at MSKCC, Harvard Medical School and Broad Institute; see Materials and methods). This test measures each method's reliability, i.e., its robustness to noise and technological factors. EPoC estimation of the CNA-driven network G is the best-performing method on the TCGA data (lower 1−W, arrow ↗). Glasso is second best, followed by sparse estimation of the transcriptional network A (EPoC A), and remMap. LirNet, eQTL, GeneNet and ARACNE all exhibit less robust performance than EPoC G. (B) We map interactions found by EPoC and other methods to molecular links in the pathway repositories HPRD, Reactome, IntAct and NCI-Nature. Each interaction is characterized by the minimal number of steps needed to 'walk' between the network gene and its target (i.e., the shortest path). We argue that a well-estimated network should consist of identified interactions that either match known interactions in the databases or are enriched for shorter paths. The figure depicts the enrichment (relative proportion of interactions that correspond to a shortest path length of 1 or 2 interactions in a pooled network based on the four different pathway databases). EPoC G interactions are clearly enriched for short or direct paths in the databases, followed by glasso and EPoC A.
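The shortest-path characterization in panel B can be illustrated with a minimal sketch. It assumes the pooled pathway network is an undirected graph given as a list of gene pairs, and it computes only the plain proportion of predicted interactions whose shortest path has length 1 or 2; the caption does not specify the baseline used for the enrichment, so none is modeled here. The function names and the gene labels A, B, C, D, Z are hypothetical placeholders, not taken from the paper or its software.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the step count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def short_path_fraction(graph, interactions, max_steps=2):
    """Proportion of predicted interactions with a path of 1..max_steps steps."""
    if not interactions:
        return 0.0
    hits = 0
    for source, target in interactions:
        dist = shortest_path_length(graph, source, target)
        if dist is not None and 1 <= dist <= max_steps:
            hits += 1
    return hits / len(interactions)


# Hypothetical pooled pathway links and predicted network interactions.
pooled = build_graph([("A", "B"), ("B", "C"), ("C", "D")])
predicted = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "Z")]
fraction = short_path_fraction(pooled, predicted)
```

In this toy input the four predicted pairs have path lengths 1, 2, 3 and "unreachable", so half of them count as short or direct.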


Mentions: We identify the subset of 146 patients (out of the 186 patients analyzed above) for whom two independent CNA and mRNA data sets have been produced at different institutes in the TCGA consortium. These technically independent data sets provide an ideal setting for an unbiased comparison of the methods. We thus apply each method to the two data sets, and use Kendall's W to investigate the consistency between the two solutions (Materials and methods). This analysis shows stronger performance by EPoC CNA-driven networks G than by all other methods for all but the largest network sizes (Figure 6A), i.e., EPoC G network solutions from two technically independent data sets largely agree in both the detection and the estimated strength of network interactions.
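As a rough illustration of the consistency measure named above, the following sketch computes Kendall's coefficient of concordance W for two sets of scores over the same items. It assumes each data set yields one interaction-strength value per candidate edge, with edges listed in the same order; how the paper actually prepares its inputs is described in its Materials and methods and is not reproduced here. The formula is the standard one without tie correction, and the function names and example values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(score_lists):
    """Kendall's W = 12 S / (m^2 (n^3 - n)) for m score lists over n items."""
    m = len(score_lists)
    n = len(score_lists[0])
    if m < 2 or n < 2 or any(len(s) != n for s in score_lists):
        raise ValueError("need at least two equal-length lists of 2+ items")
    ranked = [average_ranks(scores) for scores in score_lists]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical edge strengths estimated from two replicate data sets.
strengths_a = [0.9, 0.1, 0.5, 0.3]
strengths_b = [0.8, 0.2, 0.6, 0.4]
w_agree = kendalls_w([strengths_a, strengths_b])
w_reversed = kendalls_w([[1, 2, 3], [3, 2, 1]])
```

W equals 1 when the two solutions rank every edge identically and 0 when the rankings cancel out, so the quantity 1−W plotted in Figure 6A is lower for more consistent methods.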

