Discovery and saturation analysis of cancer genes across 21 tumour types.

Lawrence MS, Stojanov P, Mermel CH, Robinson JT, Garraway LA, Golub TR, Meyerson M, Gabriel SB, Lander ES, Getz G - Nature (2014)

Bottom Line: We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600-5,000 samples per tumour type, depending on background mutation frequency.

Affiliation: Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.

ABSTRACT
Although a few cancer genes are mutated in a high proportion of tumours of a given type (>20%), most are mutated at intermediate frequencies (2-20%). To explore the feasibility of creating a comprehensive catalogue of cancer genes, we analysed somatic point mutations in exome sequences from 4,742 human cancers and their matched normal-tissue samples across 21 cancer types. We found that large-scale genomic analysis can identify nearly all known cancer genes in these tumour types. Our analysis also identified 33 genes that were not previously known to be significantly mutated in cancer, including genes related to proliferation, apoptosis, genome stability, chromatin regulation, immune evasion, RNA processing and protein homeostasis. Down-sampling analysis indicates that larger sample sizes will reveal many more genes mutated at clinically important frequencies. We estimate that near-saturation may be achieved with 600-5,000 samples per tumour type, depending on background mutation frequency. The results may help to guide the next stage of cancer genomics.
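
Why the number of samples needed should depend on background mutation frequency can be illustrated with a simple binomial power calculation. The sketch below is not the authors' method. It assumes a hypothetical gene of 1,500 bases, a uniform per-base background rate `mu`, a Bonferroni-style threshold of 0.1/20,000, 90% target power, and a gene mutated in 2% of patients above background (the lower end of the intermediate range quoted above). All function names and parameter values are illustrative assumptions.

```python
import math


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), with 0 < p < 1, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def critical_count(n, p0, alpha):
    """Smallest k with P(X >= k | background only) <= alpha (n + 1 if none in 0..n)."""
    cdf = 0.0
    for k in range(n + 1):
        if 1.0 - cdf <= alpha:  # 1 - P(X <= k - 1) = P(X >= k)
            return k
        cdf += binom_pmf(k, n, p0)
    return n + 1


def detection_power(n, p0, p1, alpha):
    """Probability that a gene with per-patient mutation probability p1 passes the threshold."""
    k = critical_count(n, p0, alpha)
    below = sum(binom_pmf(i, n, p1) for i in range(min(k, n + 1)))
    return max(0.0, 1.0 - below)


def samples_needed(mu, excess_freq, gene_len=1500, alpha=0.1 / 20000,
                   target_power=0.9, step=25, n_max=10000):
    """First sample size on a coarse grid that reaches the target power (None if not reached)."""
    # Assumption: a patient carries a background mutation in the gene with probability p0.
    p0 = 1.0 - (1.0 - mu) ** gene_len
    # Assumption: the gene is additionally mutated in a fraction excess_freq of patients.
    p1 = p0 + excess_freq * (1.0 - p0)
    for n in range(step, n_max + 1, step):
        if detection_power(n, p0, p1, alpha) >= target_power:
            return n
    return None


# Hypothetical background rates of 0.5 and 10 mutations per megabase.
low_background = samples_needed(mu=0.5e-6, excess_freq=0.02)
high_background = samples_needed(mu=10e-6, excess_freq=0.02)
print(low_background, high_background)
```

Under these assumptions the higher background rate requires more samples, because more mutated patients are needed before a gene stands out from noise. The exact numbers depend on the assumed gene length, threshold and power, and should not be read as a reproduction of the published estimates.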

Figure 4: Down-sampling analysis shows that gene discovery is continuing as samples and tumor types are added. a. Analysis within tumor types. Each point represents a random subset of patients. Blue line is a smoothed fit. b. Analysis by adding tumor types. Each grey line represents a random ordering of the 21 tumor types. c. Analysis by adding samples. Each point is a random subset of the 4742 patients. d. Analysis in panel c broken down by mutation frequency. Genes mutated at frequencies ≥ 20% are nearing saturation, while intermediate frequencies show steep growth. See also Supplementary Figures 7, 8.

Mentions: For each tumor type (omitting those with five or fewer candidate cancer genes), the number of genes increases roughly linearly with sample size (examples in Figure 4a; see also Supplementary Figure 7), indicating that the inventory for each of the tumor types is far from complete. The number of genes also increases linearly with the number of tumor types studied (Figure 4b), suggesting that it is valuable to increase both the sample size per tumor type and the number of tumor types.
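
The down-sampling idea described above (draw random subsets of patients and count how many genes are detected at each subset size) can be sketched on synthetic data. This is a minimal illustration, not the analysis behind Figure 4: the cohort is simulated, the gene frequencies are made up, and a fixed minimum count of mutated patients (`min_count`) is a crude stand-in for a real significance test.

```python
import random


def simulate_cohort(n_patients, gene_freqs, rng):
    """Synthetic cohort: one set of mutated gene indices per patient."""
    return [{g for g, f in enumerate(gene_freqs) if rng.random() < f}
            for _ in range(n_patients)]


def genes_detected(patients, n_genes, min_count):
    """Count genes mutated in at least min_count of the given patients."""
    counts = [0] * n_genes
    for mutated in patients:
        for g in mutated:
            counts[g] += 1
    return sum(1 for c in counts if c >= min_count)


def down_sample_curve(cohort, n_genes, sizes, min_count, n_repeats, rng):
    """Mean number of detected genes over random patient subsets of each size."""
    curve = {}
    for size in sizes:
        hits = [genes_detected(rng.sample(cohort, size), n_genes, min_count)
                for _ in range(n_repeats)]
        curve[size] = sum(hits) / n_repeats
    return curve


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical mix: a few high-frequency genes, many intermediate-frequency ones.
gene_freqs = [0.30] * 3 + [0.05] * 20 + [0.02] * 40
cohort = simulate_cohort(1000, gene_freqs, rng)
curve = down_sample_curve(cohort, len(gene_freqs), [50, 100, 200, 400, 800],
                          min_count=8, n_repeats=20, rng=rng)
for size, mean_hits in curve.items():
    print(size, mean_hits)
```

In this toy setting the high-frequency genes are picked up at small subset sizes, while the intermediate-frequency genes only appear as the subsets grow. That is the qualitative pattern the figure caption describes.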

