Integration of copy number and transcriptomics provides risk stratification in prostate cancer: A discovery and validation cohort study.

Ross-Adams H, Lamb AD, Dunning MJ, Halim S, Lindberg J, Massie CM, Egevad LA, Russell R, Ramos-Montoya A, Vowler SL, Sharma NL, Kay J, Whitaker H, Clark J, Hurst R, Gnanapragasam VJ, Shah NC, Warren AY, Cooper CS, Lynch AG, Stark R, Mills IG, Grönberg H, Neal DE, CamCaP Study Gro - EBioMedicine (2015)

Bottom Line: We show the relative contributions of gene expression and copy number data on phenotype, and demonstrate the improved power gained from integrative analyses. We confirm a number of previously published molecular changes associated with high risk disease, including MYC amplification, and NKX3-1, RB1 and PTEN deletions, as well as over-expression of PCA3 and AMACR, and loss of MSMB in tumour tissue. A subset of the 100 genes outperforms established clinical predictors of poor prognosis (PSA, Gleason score), as well as previously published gene signatures (p = 0.0001).


Affiliation: Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK.

ABSTRACT

Background: Understanding the heterogeneous genotypes and phenotypes of prostate cancer is fundamental to improving the way we treat this disease. As yet, there are no validated descriptions of prostate cancer subgroups derived from integrated genomics linked with clinical outcome.

Methods: In a study of 482 tumour, benign and germline samples from 259 men with primary prostate cancer, we used integrative analysis of copy number alterations (CNA) and array transcriptomics in an expression quantitative trait loci (eQTL) approach to identify genomic loci that affect mRNA expression levels. We used these loci to stratify patients into subgroups, associated the subgroups with future clinical behaviour, and compared the results with those obtained from CNA or transcriptomics alone.

Findings: We identified five separate patient subgroups with distinct genomic alterations and expression profiles based on 100 discriminating genes in our separate discovery and validation sets of 125 and 103 men. These subgroups were able to consistently predict biochemical relapse (p = 0.0017 and p = 0.016 respectively) and were further validated in a third cohort with long-term follow-up (p = 0.027). We show the relative contributions of gene expression and copy number data on phenotype, and demonstrate the improved power gained from integrative analyses. We confirm alterations in six genes previously associated with prostate cancer (MAP3K7, MELK, RCBTB2, ELAC2, TPD52, ZBTB4), and also identify 94 genes not previously linked to prostate cancer progression that would not have been detected using either transcript or copy number data alone. We confirm a number of previously published molecular changes associated with high risk disease, including MYC amplification, and NKX3-1, RB1 and PTEN deletions, as well as over-expression of PCA3 and AMACR, and loss of MSMB in tumour tissue. A subset of the 100 genes outperforms established clinical predictors of poor prognosis (PSA, Gleason score), as well as previously published gene signatures (p = 0.0001). We further show how our molecular profiles can be used for the early detection of aggressive cases in a clinical setting, and inform treatment decisions.

Interpretation: For the first time in prostate cancer, this study demonstrates the importance of integrated genomic analyses that incorporate both benign and tumour tissue data for identifying molecular alterations. These analyses generate robust gene sets that are predictive of clinical outcome in independent patient cohorts.


Integrative subgroups have characteristic molecular profiles. Genome-wide frequencies of somatic copy number alterations (CNAs) presented as a percentage of samples (left y-axis) in each integrated Cluster (iCluster). Regions of copy number gain are indicated in red and regions of loss in blue. Subgroups were identified by integrated hierarchical clustering (as described in Methods) of the discovery cohort (n = 125). For the validation cohort (n = 103), men were allocated to these same clusters as described (see Suppl. Fig. 6). Differentially expressed genes (DEG) are superimposed for each cluster; only genes with log2 fold change > 1.5 or < −1.5 are shown (tumour versus matched benign; right y-axis). The top ten strongest DEGs in each cluster are annotated (see Suppl. Table 8 for full list).
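The fold-change cut-off in the caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the gene labels and the expression values are all hypothetical, and the sketch assumes linear-scale expression values (array data are often already log2-transformed, in which case the fold change is a simple difference).

```python
import math


def log2_fold_change(tumour, benign):
    """log2 ratio of tumour to matched benign expression (linear scale, both > 0)."""
    return math.log2(tumour / benign)


def filter_deg(expression, threshold=1.5):
    """Keep genes with log2 fold change > threshold or < -threshold.

    expression maps a gene label to a (tumour, benign) pair of linear-scale values.
    """
    selected = {}
    for gene, (tumour, benign) in expression.items():
        lfc = log2_fold_change(tumour, benign)
        if lfc > threshold or lfc < -threshold:
            selected[gene] = lfc
    return selected


# Made-up values for illustration only; these are not data from the study.
toy = {
    "GENE_A": (16.0, 2.0),  # log2(8) = 3.0, kept
    "GENE_B": (3.0, 2.0),   # log2(1.5) is about 0.58, dropped
    "GENE_C": (1.0, 8.0),   # log2(1/8) = -3.0, kept
}
result = filter_deg(toy)
```

With these toy inputs only GENE_A and GENE_C pass the filter, mirroring how the figure shows only strongly up- or down-regulated genes.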
© Copyright Policy - CC BY

Mentions: These eQTL features were used in a joint latent variable framework for integrative analysis (iClusterPlus (Mo et al., 2013); see Methods), which identified five distinct molecular subtypes (iCluster1–5) in the Cambridge cohort with characteristic copy number and gene expression profiles (Fig. 2). These were driven by a core set of 100 genes that had both CN and mRNA level changes (Suppl. Table 6). We confirmed this by comparing the results for alternative numbers of clusters (2–11) and features (100 to 1000) (see Suppl. Fig. 5A–C; Suppl. methods). These five clusters (k = 4 latent variables, which in iClusterPlus yields k + 1 clusters; 100 features) describe 60% of the total observed variance (Suppl. Fig. 5A). These same 100 gene features were used to train a classifier and to partition the Stockholm data set into five patient subtypes with characteristic profiles (Suppl. Fig. 6), similar to those described in the discovery cohort.
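The step of training a classifier on the discovery cohort and using it to assign validation samples to the same clusters can be sketched as follows. The text does not say which classifier was used, so this example swaps in a nearest-centroid rule purely as an assumption. The function names, the three-feature vectors and the two cluster labels are hypothetical stand-ins for the 100 gene features and five iClusters.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Average the feature vectors of the discovery samples within each cluster."""
    sums = {}
    counts = defaultdict(int)
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(centroids, features):
    """Assign a new sample to the cluster with the nearest centroid (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Toy discovery cohort: made-up feature vectors with known cluster labels.
discovery = [
    [1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 1.0, 1.0],
    [0.2, 0.8, 1.0],
]
discovery_labels = ["iCluster1", "iCluster1", "iCluster2", "iCluster2"]

centroids = fit_centroids(discovery, discovery_labels)

# Toy validation samples, allocated to the clusters learned above.
validation = [[0.9, 0.1, 0.1], [0.0, 1.0, 0.9]]
predictions = [predict(centroids, sample) for sample in validation]
```

A centroid rule is chosen here only because it is short and transparent; any supervised classifier trained on the same features would fill the same role in the workflow.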

