Distinguishing between driver and passenger mutations in individual cancer genomes by network enrichment analysis.

Merid SK, Goranskaya D, Alexeyenko A - BMC Bioinformatics (2014)

Bottom Line: We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations.


Affiliation: Department of Microbiology, Tumour and Cell biology, Bioinformatics Infrastructure for Life Sciences, Science for Life Laboratory, Karolinska Institutet, 17177 Stockholm, Sweden. andrej.alekseenko@scilifelab.se.

ABSTRACT

Background: In somatic cancer genomes, delineating genuine driver mutations against a background of multiple passenger events is a challenging task. The difficulty of determining function from sequence data and the low frequency of mutations are increasingly hindering the search for novel, less common cancer drivers. The accumulation of extensive amounts of data on somatic point and copy number alterations necessitates the development of systematic methods for driver mutation analysis.

Results: We introduce a framework for detecting driver mutations via functional network analysis, which is applied to individual genomes and does not require pooling multiple samples. It probabilistically evaluates 1) functional network links between different mutations in the same genome and 2) links between individual mutations and known cancer pathways. In addition, it can employ correlations of mutation patterns in pairs of genes. The method was used to analyze genomic alterations in two TCGA datasets, one for glioblastoma multiforme and another for ovarian carcinoma, which were generated using different approaches to mutation profiling. The proportions of drivers among the reported de novo point mutations in these cancers were estimated to be 57.8% and 16.8%, respectively. Both sets also included extended chromosomal regions with synchronous duplications or losses of multiple genes. We identified putative copy number driver events within many such segments. Finally, we summarized seemingly disparate mutations and discovered a functional network of collagen modifications in glioblastoma. To select the most efficient network for use with this method, we used a novel, ROC curve-based procedure for benchmarking different network versions by their ability to recover pathway membership.

Conclusions: The results of our network-based procedure were in good agreement with published gold standard sets of cancer genes and were shown to complement and expand frequency-based driver analyses. On the other hand, three sequence-based methods applied to the same data yielded poor agreement with each other and with our results. We review the difference in driver proportions discovered by different sequencing approaches and discuss the functional roles of novel driver mutations. The software used in this work and the global network of functional couplings are publicly available at http://research.scilifelab.se/andrej_alexeyenko/downloads.html.


Schematic representation of the network enrichment analysis applied to detection of driver mutations. A, total quantification of inter-relations between somatic point mutations (PM) in one genome. B, test of a single point mutation for being related to all other PMs (1point-vs-MGS). C, test of a copy number alteration (CNA) against all PMs in the same genome (1CNA-vs-MGS). D, test of either a CNA or PM against a known cancer pathway (CP), irrespective of genome (1-vs-CPW). E, overview of the algorithm. The analyses at B, C, and D were summarized into a single combined p-value for each gene copy number change (yellow) and point mutation (red).
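The caption states that the analyses in panels B, C, and D were summarized into a single combined p-value, but it does not say which combination rule was used. As a minimal sketch, the block below assumes Fisher's method, a standard rule for combining p-values that is exact only when the tests are statistically independent. The function name `fisher_combine` and the example values are hypothetical and are not taken from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's method (an assumed rule, see lead-in).

    The statistic X = -2 * sum(ln p_i) follows a chi-squared distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    floor = 1e-300  # guard against log(0) for p-values reported as zero
    half_stat = -sum(math.log(max(p, floor)) for p in p_values)  # equals X/2
    k = len(p_values)
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # running value of (X/2)^i / i!
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical p-values for one mutation from two of the three test modes.
combined_two = fisher_combine([0.05, 0.05])
combined_single = fisher_combine([0.2])
```

Because the three test modes share the same network and partly the same gene sets, their p-values are unlikely to be fully independent, so a rule of this kind should be read as an approximation.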

Mentions: The multiplicity of observed somatic mutations in most cancer genomes indicates that the emergence of cancer might require perturbations at multiple network points. This conjecture was confirmed in our previous work [25]: many individual, tumor-specific sets of somatic mutations in GBM exhibited coherence in the global network context when analyzed as whole groups (or mutated gene sets, MGS). A representative case is shown in Figure 1A. This coherence was demonstrated by the presence of a greater number of connections between simultaneously mutated genes than the number expected by chance alone (analysis details for the GBM and OV sets are given under the heading “Coherence of genome-specific sets of point mutations” in the Methods section). This suggests that MGSs could serve as the functional gene sets needed for the NEA tests in the current work. Each particular mutation in an MGS may be either a passenger, in which case no enrichment with respect to the rest of the MGS should be detected, or a driver, in which case we should obtain a significant network enrichment score (provided the global network contains the relevant edges). In addition to using MGSs, we can test each mutation against known cancer pathways. In this case, we expect that the mutation interacts with pathway genes, while the latter are not necessarily mutated in this genome. Thus, we applied three modes of NEA in parallel, independently of each other (illustrated in panels B, C, and D of Figure 1), and combined their results at the last step (Figure 1E). In these modes the individual genomic alterations (i.e. point mutations or copy number changes that could influence protein-coding genes) were evaluated against the other point mutations in the same genome (panel B, 1point-vs-MGS), all point mutations in the same genome in the case of a copy number alteration (panel C, 1CNA-vs-MGS), and known cancer pathways irrespective of genome (panel D, 1-vs-CPW).
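The idea of testing one mutated gene against the rest of its mutated gene set can be illustrated with a small sketch. It counts the network links between a gene and a gene set and compares that count with the number expected from node degrees alone. The expectation formula (degree of the gene times summed degree of the set, divided by twice the number of edges) and the Poisson tail used for the p-value are simplifying assumptions made here for illustration; they are not claimed to be the statistic used in the paper. The function names, the toy network, and the gene labels are hypothetical.

```python
import math


def build_adjacency(edges):
    """Turn a list of undirected edges into a dict of neighbor sets."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def enrichment(adj, gene, gene_set):
    """Return (observed links, expected links, upper-tail p-value).

    Observed: edges between `gene` and the other members of `gene_set`.
    Expected: k_gene * K_set / (2 * M), a degree-based approximation.
    P-value: Poisson probability of seeing at least `observed` links.
    """
    others = set(gene_set) - {gene}
    neighbors = adj.get(gene, set())
    observed = len(neighbors & others)
    total_edges = sum(len(n) for n in adj.values()) / 2
    if total_edges == 0:
        return observed, 0.0, 1.0
    k_gene = len(neighbors)
    k_set = sum(len(adj.get(g, set())) for g in others)
    expected = k_gene * k_set / (2 * total_edges)
    below = sum(
        math.exp(-expected) * expected ** i / math.factorial(i)
        for i in range(observed)
    )
    return observed, expected, max(0.0, 1.0 - below)


# Toy network with made-up gene labels; "P" plays the role of a passenger.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("D", "E"), ("P", "E")]
adj = build_adjacency(toy_edges)
mgs = {"A", "B", "C", "P"}  # hypothetical mutated gene set of one genome

result_a = enrichment(adj, "A", mgs)  # gene linked to other MGS members
result_p = enrichment(adj, "P", mgs)  # gene with no links into the MGS
```

In this toy case gene A has two links into the set against one expected, while gene P has none, which mirrors the driver versus passenger distinction described above. With a real network, the same function could be applied to a pathway gene set instead of an MGS to mimic the pathway-based mode.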

