A pan-cancer proteomic perspective on The Cancer Genome Atlas.

Akbani R, Ng PK, Werner HM, Shahmoradgoli M, Zhang F, Ju Z, Liu W, Yang JY, Yoshihara K, Li J, Ling S, Seviour EG, Ram PT, Minna JD, Diao L, Tong P, Heymach JV, Hill SM, Dondelinger F, Städler N, Byers LA, Meric-Bernstam F, Weinstein JN, Broom BM, Verhaak RG, Liang H, Mukherjee S, Lu Y, Mills GB - Nat Commun (2014)

Bottom Line: Therefore, direct study of the functional proteome has the potential to provide a wealth of information that complements and extends genomic, epigenomic and transcriptomic analysis in The Cancer Genome Atlas (TCGA) projects. The resultant proteomic data are integrated with genomic and transcriptomic analyses of the same samples to identify commonalities, differences, emergent pathways and network biology within and across tumour lineages. In addition, tissue-specific signals are reduced computationally to enhance biomarker and target discovery spanning multiple tumour lineages.


Affiliation: [1] Department of Bioinformatics and Computational Biology, 1400 Pressler St., The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA [2].

ABSTRACT
Protein levels and function are poorly predicted by genomic and transcriptomic analysis of patient tumours. Therefore, direct study of the functional proteome has the potential to provide a wealth of information that complements and extends genomic, epigenomic and transcriptomic analysis in The Cancer Genome Atlas (TCGA) projects. Here we use reverse-phase protein arrays to analyse 3,467 patient samples from 11 TCGA 'Pan-Cancer' diseases, using 181 high-quality antibodies that target 128 total proteins and 53 post-translationally modified proteins. The resultant proteomic data are integrated with genomic and transcriptomic analyses of the same samples to identify commonalities, differences, emergent pathways and network biology within and across tumour lineages. In addition, tissue-specific signals are reduced computationally to enhance biomarker and target discovery spanning multiple tumour lineages. This integrative analysis, with an emphasis on pathways and potentially actionable proteins, provides a framework for determining the prognostic, predictive and therapeutic relevance of the functional proteome.



Figure 2: Unsupervised clustering and analyses based on the RBN dataset.
(a) Heatmap depicting protein levels after unsupervised hierarchical clustering of the RBN dataset consisting of 3,467 cancer samples across 11 tumor types and 181 antibodies. Protein levels are indicated on a low-to-high scale (blue-white-red). Eight clusters are defined. Cluster_A has been subdivided into two clusters (A1 and A2), based on the differences between BRCA reactive and remaining luminal subtypes (ref. 5). Annotation bars include tumor type (BRCA-basal separately indicated); purity and ploidy (ABSOLUTE algorithm); stromal and immune scores (ESTIMATE algorithm); BRCA (PAM50 classification) and BLCA subtype; 16 significantly mutated genes and two frequently observed amplifications. The statistical significance of correlations between the clusters and each variable is indicated to the left of each annotation bar (n=3,467; chi-squared, Fisher's exact and ANOVA F tests; see Methods).
(b) Crosstab showing the number of tumor samples in each cluster.
(c-e) Kaplan-Meier curves showing overall survival of (c) the BRCA located in four separate clusters (A1, A2, E and F, n=740), (d) KIRC in cluster_F vs. KIRC in other clusters (n=454) and (e) BLCA in cluster_B vs. BLCA in other clusters (n=127). Follow-up was capped at 60 months due to the limited number of events beyond this time. Statistical difference in outcome between groups is indicated by P-value (log-rank test).
A high-resolution, interactive version of the heatmap with zooming capability can be found at http://bioinformatics.mdanderson.org/main/TCGA/Pancan11/RPPA.
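The survival panels (c-e) rely on Kaplan-Meier estimates with follow-up capped at 60 months. The sketch below illustrates that idea on invented toy data and is not the authors' pipeline. The function names `cap_follow_up` and `kaplan_meier`, the toy follow-up times, and the choice to treat any observation past the cap as censored at the cap are all assumptions made for illustration.

```python
def cap_follow_up(times, events, cap=60.0):
    """Censor follow-up at `cap` months (assumed convention).

    times  : follow-up in months
    events : 1 if death was observed, 0 if censored
    """
    capped = []
    for t, e in zip(times, events):
        if t > cap:
            # Anything observed after the cap is treated as censored at the cap.
            capped.append((cap, 0))
        else:
            capped.append((t, e))
    return capped


def kaplan_meier(observations):
    """Return [(time, survival probability)] at each distinct event time."""
    obs = sorted(observations)
    at_risk = len(obs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(obs):
        t = obs[i][0]
        deaths = 0
        removed = 0
        # Group all samples sharing the same follow-up time.
        while i < len(obs) and obs[i][0] == t:
            deaths += obs[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data (hypothetical, not from the study).
times = [5, 12, 12, 30, 48, 70, 90]
events = [1, 1, 0, 1, 0, 1, 0]

capped = cap_follow_up(times, events)
curve = kaplan_meier(capped)
for t, s in curve:
    print(f"{t:>5.1f} months: S(t) = {s:.3f}")
```

In this toy example the death recorded at 70 months no longer contributes an event once the cap is applied, so the curve has steps only at 5, 12 and 30 months. Comparing groups, as in the figure, would additionally need a log-rank test, which this sketch does not implement.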

Mentions: Unsupervised clustering identified eight robust clusters (Clusters A-H, Fig. 2a) when batch effects were mitigated by RBN. Not surprisingly, RBN cluster membership is defined primarily by tumor type, with the exception of cluster_E and cluster_F, which include multiple diseases (Fig. 2b). Bladder cancer, however, did not generate a dominant cluster but, rather, was co-located with other tumor lineages in multiple clusters. To identify potential discriminators of clusters, we compared the ability of proteins, RNAs, miRNAs and mutations to differentiate the samples in each cluster from those in all other clusters (top 25 discriminators, Supplementary Tables 2-5; all the discriminators are available at http://bioinformatics.mdanderson.org/main/TCGA/Pancan11/RPPA). Supplementary Table 2 highlights the contribution of individual proteins in driving the different clusters. Associations of specific mutations and copy number changes with the clusters were primarily based on known associations of mutations and copy number changes with tumor lineage (refs. 4-10).
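The one-cluster-versus-all-others comparison described above can be sketched as a simple ranking of features. The text here does not say which statistic the authors used, so the Welch t-statistic below is an assumption chosen for illustration, as are the function names, the feature names `protein_1` and `protein_2`, and the toy values. It applies only to continuous features such as protein or RNA levels, not to mutations.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic for two samples (each needs >= 2 values, nonzero variance)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def top_discriminators(data, labels, cluster, k=25):
    """Rank features by how strongly they separate `cluster` from all other samples.

    data   : {feature name: [value per sample]}
    labels : cluster label per sample, in the same sample order
    """
    scores = {}
    for feature, values in data.items():
        inside = [v for v, lab in zip(values, labels) if lab == cluster]
        outside = [v for v, lab in zip(values, labels) if lab != cluster]
        scores[feature] = welch_t(inside, outside)
    # Largest absolute statistic first; keep the top k features.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)[:k]


# Toy data (hypothetical): six samples, two clusters, two features.
labels = ["A", "A", "A", "B", "B", "B"]
data = {
    "protein_1": [2.0, 2.1, 1.9, 0.0, 0.1, -0.1],   # differs between clusters
    "protein_2": [0.5, -0.5, 0.0, 0.4, -0.4, 0.1],  # similar in both clusters
}

ranked = top_discriminators(data, labels, cluster="A")
for name, t in ranked:
    print(f"{name}: t = {t:.2f}")
```

With `k=25` this mirrors the "top 25 discriminators" cut-off mentioned in the text. A real analysis would also need multiple-testing correction and a separate test for categorical features such as mutations.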

