Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis.

Merid SK, Goranskaya D, Alexeyenko A - BMC Bioinformatics (2014)

Bottom Line: We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations.


Affiliation: Department of Microbiology, Tumour and Cell biology, Bioinformatics Infrastructure for Life Sciences, Science for Life Laboratory, Karolinska Institutet, 17177 Stockholm, Sweden. andrej.alekseenko@scilifelab.se.

ABSTRACT

Background: In somatic cancer genomes, delineating genuine driver mutations against a background of multiple passenger events is a challenging task. The difficulty of determining function from sequence data and the low frequency of mutations are increasingly hindering the search for novel, less common cancer drivers. The accumulation of extensive amounts of data on somatic point and copy number alterations necessitates the development of systematic methods for driver mutation analysis.

Results: We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. It probabilistically evaluates 1) functional network links between different mutations in the same genome and 2) links between individual mutations and known cancer pathways. In addition, it can employ correlations of mutation patterns in pairs of genes. The method was used to analyze genomic alterations in two TCGA datasets, one for glioblastoma multiforme and another for ovarian carcinoma, which were generated using different approaches to mutation profiling. The proportions of drivers among the reported de novo point mutations in these cancers were estimated to be 57.8% and 16.8%, respectively. Both sets also included extended chromosomal regions with synchronous duplications or losses of multiple genes. We identified putative copy number driver events within many such segments. Finally, we summarized seemingly disparate mutations and discovered a functional network of collagen modifications in glioblastoma. To select the most efficient network for use with this method, we used a novel, ROC curve-based procedure for benchmarking different network versions by their ability to recover pathway membership.
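
To make the second kind of evidence (links between an individual mutation and a known pathway) concrete, the sketch below counts the network links between one mutated gene and a pathway gene set and compares that count with a degree-based expectation. It is only an illustration: the toy edge list, the gene labels, and the function names are hypothetical, and the expected-count formula (product of degrees divided by twice the number of edges) is an assumed null model, not necessarily the statistic implemented in the published software.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn an undirected edge list into a dict: gene -> set of neighbours."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-links
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def link_enrichment(adj, gene, gene_set):
    """Return (observed, expected) link counts between `gene` and `gene_set`.

    ASSUMPTION: the expectation uses a simple degree-based null,
    deg(gene) * sum(deg(member)) / (2 * total_edges).
    """
    members = set(gene_set) - {gene}
    neighbours = adj.get(gene, set())
    observed = len(neighbours & members)
    total_edges = sum(len(n) for n in adj.values()) / 2
    degree_gene = len(neighbours)
    degree_set = sum(len(adj.get(m, set())) for m in members)
    expected = degree_gene * degree_set / (2 * total_edges)
    return observed, expected


# Hypothetical toy network: G1 and G2 are "mutated" genes, P1..P3 form a pathway.
toy_edges = [
    ("G1", "P1"), ("G1", "P2"), ("P1", "P2"),
    ("G2", "X1"), ("X1", "X2"), ("X2", "P3"),
]
toy_adj = build_adjacency(toy_edges)
pathway = {"P1", "P2", "P3"}

for candidate in ("G1", "G2"):
    obs, exp = link_enrichment(toy_adj, candidate, pathway)
    print(f"{candidate}: observed={obs}, expected={exp:.2f}")
```

In this toy network G1 has more pathway links than the degree-based null predicts and G2 has fewer, which is the kind of contrast a network-based driver test relies on. A real analysis would also need a significance estimate for the difference, which this sketch leaves out.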

Conclusions: The results of our network-based procedure were in good agreement with published gold standard sets of cancer genes and were shown to complement and expand frequency-based driver analyses. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations. The software used in this work and the global network of functional couplings are publicly available at http://research.scilifelab.se/andrej_alexeyenko/downloads.html.


Figure 4: Driver analysis over extended chromosomal regions. All the components of NEA were run over copy-number altered regions of chromosome 7 in ovarian cancer. The grey areas mask the chromosomal regions omitted from this plot, so that only four selected regions are shown. Multi-colored bars in the lower plot indicate the copy numbers in individual OV genomes relative to the reference diploid genome (dotted red line). Grey lines indicate the 10th, 25th, 50th, 75th, and 90th percentiles of copy number in the OV cohort. The upper plot shows the prioritization of genes based on the −log10 of their combined p-values: genes with higher positions are more highly prioritized. Genes found to be highly significant in the 1-vs-CPW analysis are highlighted in red, those highly significant in the 1CNA-vs-MGS analysis are marked with a black asterisk, and red asterisks indicate genes highly significant in both analyses. Gene symbols in brackets indicate point mutations that co-occurred with the driver CNAs. A complete version of this figure can be found in Additional file 2 and Additional file 3: GBM.CNA_drivers_along_chromosomes.pdf and OV.CNA_drivers_along_chromosomes.pdf.
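
The caption ranks genes by the −log10 of a combined p-value but does not say how the individual p-values were combined. As a minimal sketch, assuming Fisher's method (a swapped-in, standard choice, not confirmed as the authors' procedure), the ranking score could be computed as follows. The function names are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (ASSUMED here).

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # (X/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def priority_score(p_values):
    """-log10 of the combined p-value; larger means higher priority."""
    return -math.log10(fisher_combined_p(p_values))


# Illustrative p-values only; they are not taken from the paper.
print(priority_score([0.01, 0.01]))
print(priority_score([0.5, 0.5]))
```

Fisher's method assumes the combined tests are independent. The three analyses shown in the figure may well be correlated, so this choice should be treated as a convenience for illustration.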

Mentions: We visualized the results of all three tests and their combined results using chromosomal maps (Figure 4 and Additional file 2: GBM.CNA_drivers_along_chromosomes.pdf and Additional file 3: OV.CNA_drivers_along_chromosomes.pdf). Figure 4 shows the results of the analysis for chromosome 7 in the OV set. While the copy numbers varied along the chromosome’s length, there were only a few regions in which these variations significantly co-occurred with point mutations in such genes as TP53, BRCA2, or TTN (see names in brackets) and thus satisfied the first condition. Next, only some of these genes were further functionally linked to either a given MGS (indicated with an asterisk) or to a particular cancer pathway (indicated by red coloration). A few genes satisfied all three criteria: EGFR, PIK3CG, HBP1, OPN1SW, MET, and CALD1. The left chromosomal arm probably exhibited a tendency toward duplication primarily because this increased the copy number of EGFR. Variations in the other chromosomes may have affected a number of different drivers. Interestingly, in the GBM cohort, EGFR CNAs co-occurred with point mutations in the same gene (mostly of the missense type): out of 24 genomes with point mutations in EGFR, 22 also contained EGFR duplications. However, there were 72 other GBM samples that contained EGFR duplications but no EGFR point mutations.
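
The EGFR counts quoted above can be arranged as a 2×2 table (point mutation yes/no against duplication yes/no). The sketch below computes the two conditional fractions that follow directly from the quoted numbers and provides a one-sided hypergeometric (Fisher exact) tail for testing co-occurrence. The size of the GBM cohort is not given in this excerpt, so the fourth cell of the table (samples with neither alteration) is left for the reader to supply. The function and variable names are hypothetical, and the exact test is an assumed choice, not necessarily the co-occurrence test used in the paper.

```python
import math


def cooccurrence_p(both, only_a, only_b, neither):
    """One-sided Fisher exact p-value: P(overlap >= both) under independence.

    The 2x2 table has `both` samples carrying alterations A and B, `only_a`
    and `only_b` carrying just one of them, and `neither` carrying none.
    """
    n_total = both + only_a + only_b + neither
    n_a = both + only_a
    n_b = both + only_b
    denom = math.comb(n_total, n_b)
    tail = 0
    for x in range(both, min(n_a, n_b) + 1):
        # math.comb returns 0 when the lower argument exceeds the upper one
        tail += math.comb(n_a, x) * math.comb(n_total - n_a, n_b - x)
    return tail / denom


# Counts quoted in the text for the GBM cohort.
point_mutated = 24         # genomes with EGFR point mutations
both = 22                  # of those, genomes that also carry an EGFR duplication
dup_only = 72              # duplication without a point mutation
point_only = point_mutated - both

frac_point_with_dup = both / point_mutated        # 22 / 24
frac_dup_with_point = both / (both + dup_only)    # 22 / 94

print(f"point-mutated genomes that are duplicated: {frac_point_with_dup:.1%}")
print(f"duplicated genomes that are point-mutated: {frac_dup_with_point:.1%}")
```

The asymmetry is the point of the passage: nearly all EGFR point mutations sit on a duplicated copy (22 of 24, about 92%), whereas only about 23% of the duplications (22 of 94) come with a point mutation. Calling `cooccurrence_p(both, point_only, dup_only, neither)` with the true number of unaffected samples would give a significance estimate for that overlap.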

