U87MG decoded: the genomic sequence of a cytogenetically aberrant human cancer cell line.

Clark MJ, Homer N, O'Connor BD, Chen Z, Eskin A, Lee H, Merriman B, Nelson SF - PLoS Genet. (2010)

Bottom Line: Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.


Affiliation: Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America.

ABSTRACT
U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30x genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.


Figure 5: Structural variations in U87MG. Structural variations detected by whole genome sequencing in the U87MG genome are plotted in the Circos program. Orange lines linking two chromosomes represent the 35 interchromosomal translocations. Blue lines around the edge of the circle represent microdeletions and intrachromosomal translocations. The outermost histogram represents sequence coverage and demonstrates how the boundaries of changes in coverage typically coincide with a significant structural variation.

Mentions: We utilized the predictable insert distance of mate-paired sequence fragments to directly observe structural variations in U87MG. Our target insert size of 1.5kb gave us a normal distribution of paired-end insert lengths ranging from 1kb to 2kb, with a median around 1.25kb and a mean around 1.45kb in the actual sequence data (Figure S2). We identified 1,314 large structural variations, including 35 interchromosomal events, 599 complete homozygous deletions (including a large region on chromosome 9 containing CDKN2A/B, which are commonly homozygously deleted in brain cancer), 361 heterozygous deletion events, and 319 other intrachromosomal events (Table 5). The 599 complete microdeletions summed to approximately 5.76Mb of total sequence, while the 361 heterozygous microdeletions summed to 5.36Mb of total sequence. Most of the microdeletions were under 2kb in total size. Because of the high sequence coverage and the mate-pair strategy, each event was supported by an average of 138 mate-pair reads. Mispairing of the mate pairs did occur occasionally due to molecular chimerism in the library fabrication process, but such reads occurred at a low frequency (<1/40 of the reads). Thus, the true rearrangement/deletion events were highly distinct from noise in well-mapped sequences. Interchromosomal events included translocations and large insertion/deletion events where one part of a chromosome was inserted into a different chromosome, sometimes replacing a segment of DNA. Altogether, these structural variations show a highly complex rearrangement of genomic material in this cancer cell line (Figure 5). All identified structural variants are summarized in Table S2. We note as well that even when breakpoints are within genome-wide common repeats, there can be sufficient mapping information to reliably identify the translocation breakpoint (Figure S3).
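The idea of using a predictable insert distance to flag structural variation can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the reads were aligned with BFAST, and the variant-calling procedure is not described in this excerpt). The `MatePair` type, the `classify` and `tally` functions, and the use of the reported 1kb to 2kb range as hard thresholds are all assumptions made for illustration; read orientation and mapping quality are ignored. The event counts and deletion totals are the figures quoted in the passage above.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: treat the reported insert-length range (1kb to 2kb) as the
# window for a "concordant" pair. A real caller would model the distribution.
MIN_INSERT = 1_000
MAX_INSERT = 2_000


@dataclass(frozen=True)
class MatePair:
    """Hypothetical record for the mapped positions of the two mate ends."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify(pair: MatePair) -> str:
    """Label a mate pair by how its mapped distance compares to the library."""
    if pair.chrom1 != pair.chrom2:
        # Ends on different chromosomes: possible translocation or chimera.
        return "interchromosomal"
    span = abs(pair.pos2 - pair.pos1)
    if span > MAX_INSERT:
        # Ends map farther apart on the reference than the library allows,
        # as expected when the sample is missing the sequence in between.
        return "deletion_candidate"
    if span < MIN_INSERT:
        # Ends map closer together than expected, as with extra sequence
        # in the sample relative to the reference.
        return "insertion_candidate"
    return "concordant"


def tally(pairs: list[MatePair]) -> Counter:
    """Count pairs per label; real events need many supporting pairs."""
    return Counter(classify(p) for p in pairs)


# Event counts quoted in the passage; they add up to the stated total.
EVENT_COUNTS = {
    "interchromosomal": 35,
    "homozygous_deletion": 599,
    "heterozygous_deletion": 361,
    "other_intrachromosomal": 319,
}
TOTAL_EVENTS = sum(EVENT_COUNTS.values())

# Mean deletion size implied by the quoted totals (5.76Mb and 5.36Mb).
MEAN_HOM_DELETION_BP = 5_760_000 / EVENT_COUNTS["homozygous_deletion"]
MEAN_HET_DELETION_BP = 5_360_000 / EVENT_COUNTS["heterozygous_deletion"]
```

Because chimeric pairs are reported at under 1/40 of reads while each event had on average 138 supporting pairs, a caller built along these lines would typically require a minimum number of agreeing discordant pairs before reporting an event. The implied mean deletion sizes are well above 2kb even though most microdeletions are under 2kb, which would be consistent with a small number of large deletions (such as the chromosome 9 region) accounting for much of the total.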

