A somatic reference standard for cancer genome sequencing.

Craig DW, Nasser S, Corbett R, Chan SK, Murray L, Legendre C, Tembe W, Adkins J, Kim N, Wong S, Baker A, Enriquez D, Pond S, Pleasance E, Mungall AJ, Moore RA, McDaniel T, Ma Y, Jones SJ, Marra MA, Carpten JD, Liang WS - Sci Rep (2016)

Bottom Line: Aggregate variant detection identified consensus variants, including key events that represent hallmark mutation types: an amplified BRAF V600E, a CDKN2A small deletion, a 12 kb PTEN deletion, and a dinucleotide TERT promoter substitution. Overall, common events include >35,000 point mutations, 446 small insertions/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.


Affiliation: Translational Genomics Research Institute, Phoenix, Arizona, USA.

ABSTRACT
Large-scale multiplexed identification of somatic alterations in cancer has become feasible with next-generation sequencing (NGS). However, calibration of NGS somatic analysis tools has been hampered by a lack of tumor/normal reference standards. We thus performed paired PCR-free whole genome sequencing of a metastatic melanoma cell line (COLO829) and its matched normal across three lineages and across separate institutions, with independent library preparations, sequencing, and analysis. We generated mean mapped coverages of 99X for COLO829 and 103X for the paired normal across three institutions. Results were combined with previously generated data, allowing for comparison to a fourth lineage on earlier NGS technology. Aggregate variant detection identified consensus variants, including key events that represent hallmark mutation types: an amplified BRAF V600E, a CDKN2A small deletion, a 12 kb PTEN deletion, and a dinucleotide TERT promoter substitution. Overall, common events include >35,000 point mutations, 446 small insertions/deletions, and >6,000 genes affected by copy number changes. We present this reference to the community as an initial standard for enabling quantitative evaluation of somatic mutation pipelines across institutions.


Figure 2: Construction of a somatic truth set for COLO829. (A) Identification of somatic reference SNVs. The total numbers of coding SNVs present in each truth set are shown. (B) Final somatic reference standard. Selected events are shown. Somatic coding SNVs are shown as black tick marks within the outermost ring. Consensus CNV gains are shown in green and consensus CNV losses are shown in red in the innermost circle.

WGS metrics and summary findings for each institution are listed in Table 1. Over 18 billion mapped reads were generated across all PCR-free whole genome data sets, including data from Pleasance et al. (ref. 10), the Translational Genomics Research Institute (TGen), Canada’s Michael Smith Genome Sciences Centre (GSC) in British Columbia, and Illumina, Inc. A mean mapped sequence coverage of 83X (properly paired reads) was obtained for both COLO829BL and COLO829 across all four data sets, with the 2010 Pleasance data set showing the lowest coverage. An overview of data generation and collation is shown in Fig. 1.

To construct the somatic reference standard, data generated from DNA from separate cell culture preparations of the tumor/normal pair were independently analyzed by the TGen, GSC, and Illumina analytical pipelines. Consensus variants were compiled as a gVCF (genome variant call format) to distinguish true positives, true negatives, and no calls within each lineage. Somatic variants called by at least 2 of the 3 pipelines were first selected to generate the true positive truth set for each of the four cell preparations. Somatic events called in all four truth sets were then used to generate the progenitor somatic reference standard (Fig. 2B). A somatic event present in at least 3 of the 4 truth sets was also included in the somatic reference standard when the remaining truth set had low coverage at that site (depth of coverage [DOC] < 20 reads).
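The selection rules described above can be sketched in a few lines of Python. This is only an illustration of the stated logic, not the authors' implementation: the function names, the variant key layout (chromosome, position, ref, alt), the lineage labels, and the depth lookup table are all assumptions introduced here. Sites with no depth entry are treated conservatively as adequately covered, which is also an assumption.

```python
from collections import Counter

MIN_PIPELINES = 2   # variant must be called by at least 2 of the 3 pipelines
MIN_TRUTH_SETS = 3  # relaxed rule: present in at least 3 of the 4 truth sets
LOW_DOC = 20        # depth of coverage below this counts as low coverage


def lineage_truth_set(pipeline_calls):
    """Return variants called by at least MIN_PIPELINES pipelines.

    pipeline_calls: list of sets of variant keys, one set per pipeline
    (hypothetical layout; a key could be (chrom, pos, ref, alt)).
    """
    counts = Counter(v for calls in pipeline_calls for v in set(calls))
    return {v for v, n in counts.items() if n >= MIN_PIPELINES}


def reference_standard(truth_sets, depth):
    """Combine per-lineage truth sets into one reference set.

    truth_sets: dict mapping lineage label -> set of variant keys.
    depth: dict mapping (lineage, variant) -> depth of coverage at that
    site; a missing entry is assumed NOT to be low coverage.
    """
    lineages = list(truth_sets)
    candidates = set().union(*truth_sets.values())
    reference = set()
    for variant in candidates:
        missing = [l for l in lineages if variant not in truth_sets[l]]
        if not missing:
            # Called in every truth set.
            reference.add(variant)
            continue
        present = len(lineages) - len(missing)
        low_cov = all(depth.get((l, variant), LOW_DOC) < LOW_DOC for l in missing)
        if present >= MIN_TRUTH_SETS and low_cov:
            # Absent only where the site was poorly covered.
            reference.add(variant)
    return reference
```

With four lineages, the relaxed rule can only apply when exactly one truth set lacks the variant, which matches the "3 of 4 truth sets" condition in the text. A variant missing from one truth set at a well-covered site is left out, since the absence is then more likely to be a true negative than a no-call.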

