Integration of copy number and transcriptomics provides risk stratification in prostate cancer: A discovery and validation cohort study.

Ross-Adams H, Lamb AD, Dunning MJ, Halim S, Lindberg J, Massie CM, Egevad LA, Russell R, Ramos-Montoya A, Vowler SL, Sharma NL, Kay J, Whitaker H, Clark J, Hurst R, Gnanapragasam VJ, Shah NC, Warren AY, Cooper CS, Lynch AG, Stark R, Mills IG, Grönberg H, Neal DE, CamCaP Study Gro - EBioMedicine (2015)

Bottom Line: We show the relative contributions of gene expression and copy number data on phenotype, and demonstrate the improved power gained from integrative analyses. We confirm a number of previously published molecular changes associated with high risk disease, including MYC amplification, and NKX3-1, RB1 and PTEN deletions, as well as over-expression of PCA3 and AMACR, and loss of MSMB in tumour tissue. A subset of the 100 genes outperforms established clinical predictors of poor prognosis (PSA, Gleason score), as well as previously published gene signatures (p = 0.0001).


Affiliation: Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK.

ABSTRACT

Background: Understanding the heterogeneous genotypes and phenotypes of prostate cancer is fundamental to improving the way we treat this disease. As yet, there are no validated descriptions of prostate cancer subgroups derived from integrated genomics linked with clinical outcome.

Methods: In a study of 482 tumour, benign and germline samples from 259 men with primary prostate cancer, we used integrative analysis of copy number alterations (CNA) and array transcriptomics to identify genomic loci that affect expression levels of mRNA in an expression quantitative trait loci (eQTL) approach, to stratify patients into subgroups that we then associated with future clinical behaviour, and compared with either CNA or transcriptomics alone.

Findings: We identified five separate patient subgroups with distinct genomic alterations and expression profiles based on 100 discriminating genes in our separate discovery and validation sets of 125 and 103 men. These subgroups were able to consistently predict biochemical relapse (p = 0.0017 and p = 0.016 respectively) and were further validated in a third cohort with long-term follow-up (p = 0.027). We show the relative contributions of gene expression and copy number data on phenotype, and demonstrate the improved power gained from integrative analyses. We confirm alterations in six genes previously associated with prostate cancer (MAP3K7, MELK, RCBTB2, ELAC2, TPD52, ZBTB4), and also identify 94 genes not previously linked to prostate cancer progression that would not have been detected using either transcript or copy number data alone. We confirm a number of previously published molecular changes associated with high risk disease, including MYC amplification, and NKX3-1, RB1 and PTEN deletions, as well as over-expression of PCA3 and AMACR, and loss of MSMB in tumour tissue. A subset of the 100 genes outperforms established clinical predictors of poor prognosis (PSA, Gleason score), as well as previously published gene signatures (p = 0.0001). We further show how our molecular profiles can be used for the early detection of aggressive cases in a clinical setting, and inform treatment decisions.

Interpretation: For the first time in prostate cancer, this study demonstrates the importance of integrated genomic analyses that incorporate both benign and tumour tissue data for identifying molecular alterations, leading to robust gene sets that are predictive of clinical outcome in independent patient cohorts.


Integrative subgroups have distinct clinical outcomes and are powerful predictors of relapse.
A. Kaplan–Meier plot of relapse-free survival over 60 months for the five molecular subtypes in the Cambridge discovery cohort (p = 0.0017 for the two highest versus two lowest risk groups). For each cluster, the total number of samples is indicated (total relapses in brackets).
B. Kaplan–Meier plot of relapse-free survival over 96 months in the Stockholm validation cohort (p = 0.016). Further validation was undertaken in a third dataset (Taylor et al. (2010); Suppl. Fig. 9).
C. Distribution of Gleason grade across subtypes (Cambridge discovery cohort); no Gleason score predominates in any one subtype (Kruskal–Wallis p = 0.6194).
D. Cox proportional hazard ratios with 95% confidence intervals for high vs low Gleason score (≥ 4 + 3 = 7 vs ≤ 3 + 4 = 7), and every other integrative cluster vs best prognosis cluster 4. Cambridge and Stockholm datasets were combined to ensure sufficient events per variable (biochemical relapses per cluster) for robust statistical testing (Peduzzi et al., 1995). Confidence intervals shown are 0.9, 0.95 and 0.99.
E&F. Refined 100-gene set tested for power to predict relapse in the Stockholm validation set against 1000 random signatures (p < 0.001) and 189 oncological signatures (Subramanian et al., 2005; p < 0.001). Comparison was also made with other prostate cancer signatures (Suppl. Table 11).
© Copyright Policy - CC BY
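
The relapse-free survival curves in the figure are Kaplan–Meier estimates. As a minimal sketch of how such a curve is computed (not the authors' code; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration), the product-limit estimator multiplies, at each relapse time, the fraction of at-risk patients who did not relapse:

```python
def kaplan_meier(times, events):
    """Product-limit estimate of relapse-free survival.

    times  -- follow-up in months for each patient (hypothetical input)
    events -- 1 if biochemical relapse was observed, 0 if censored
    Returns a list of (time, survival) steps at each relapse time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            leaving += 1
            i += 1
        if relapses > 0:
            survival *= 1 - relapses / n_at_risk
            curve.append((t, survival))
        # Relapsed and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy data, invented for illustration only.
curve = kaplan_meier([6, 12, 12, 24, 36], [1, 1, 0, 0, 1])
```

Censored patients contribute to the at-risk denominator until their last follow-up, which is why the step sizes grow as the risk set shrinks.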

Mentions: Finally, we considered the sample groups identified by our integrative analysis (Fig. 4A) as ‘true’ clusters with clinical relevance, and compared these ‘true’ clusters to the sample groupings suggested by either copy number (Suppl. Fig. 2) or gene expression data alone (Suppl. Fig. 3). We used two different approaches to determine the similarity of the alternative clustering methods to the ‘true’ clusters. Based on both the Adjusted Rand Index (ARI) (Hubert and Arabie, 1985) and the Variation of Information Index (VII) (Meilă, 2007), sample clustering based on CN-data is more similar to integrative (‘true’) clustering than is expression-based clustering (Suppl. Table 7; Suppl. methods).
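
To make the two similarity measures concrete, the following is a minimal sketch of the Adjusted Rand Index and the Variation of Information computed from two cluster assignments of the same samples. It is not the authors' implementation; the function names and the toy label vectors are assumptions made for illustration. A higher ARI and a lower VI both indicate more similar clusterings.

```python
from collections import Counter
from math import comb, log


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index (Hubert and Arabie, 1985) of two labelings."""
    n = len(labels_a)
    if n < 2 or n != len(labels_b):
        raise ValueError("need two labelings of the same >= 2 samples")
    # Pairs of samples co-clustered in both, in A only, and in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Both partitions are trivial; treat them as identical.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def variation_of_information(labels_a, labels_b):
    """Variation of Information (Meila, 2007), in nats."""
    n = len(labels_a)
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)
    vi = 0.0
    for (a, b), c in Counter(zip(labels_a, labels_b)).items():
        # H(A|B) + H(B|A), summed over non-empty contingency cells.
        vi -= (c / n) * (log(c / size_a[a]) + log(c / size_b[b]))
    return vi


# Toy labelings, invented for illustration only.
same = ([0, 0, 1, 1], [1, 1, 0, 0])      # identical up to renaming
crossed = ([0, 0, 1, 1], [0, 1, 0, 1])   # independent partitions
```

Both measures ignore the cluster names themselves, so relabelled but otherwise identical partitions score as a perfect match.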

