Estimating Exceptionally Rare Germline and Somatic Mutation Frequencies via Next Generation Sequencing.

Eboreime J, Choi SK, Yoon SR, Arnheim N, Calabrese P - PLoS ONE (2016)

Bottom Line: These rates far exceed the well-documented human genome average frequency per base pair (~10^-8), suggesting a non-biological explanation for our data. By computational modeling and a new experimental procedure to distinguish between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments.


Affiliation: Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089-2910, United States of America.

ABSTRACT
We used targeted next-generation deep sequencing (Safe Sequencing System, SSS) to measure ultra-rare de novo mutation frequencies in the human male germline by attaching a unique identifier code to each target DNA molecule. Segments from three different human genes (FGFR3, MECP2 and PTPN11) were studied. Regardless of the gene segment, the particular testis donor or the 73 different testis pieces used, the frequencies for any one of the six different mutation types were consistent. Averaging over the C>T/G>A and G>T/C>A mutation types, the background mutation frequency was 2.6 × 10^-5 per base pair, while for the four other mutation types the average background frequency was lower, at 1.5 × 10^-6 per base pair. These rates far exceed the well-documented human genome average frequency per base pair (~10^-8), suggesting a non-biological explanation for our data. By computational modeling and a new experimental procedure to distinguish between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. Finally, we looked at a previously studied disease mutation in the PTPN11 gene and could easily distinguish true mutations from the SSS background. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments.
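
To put the reported background in perspective, its ratio to the genome average can be computed directly. The short Python sketch below uses only the three figures quoted in the abstract. The constant and function names are illustrative and do not come from the paper.

```python
# Per-base-pair frequencies as quoted in the abstract.
GENOME_AVERAGE = 1e-8      # approximate human genome average
BACKGROUND_HIGH = 2.6e-5   # C>T/G>A and G>T/C>A types, averaged
BACKGROUND_LOW = 1.5e-6    # the four other mutation types, averaged


def fold_excess(background, reference=GENOME_AVERAGE):
    """Return how many times larger `background` is than `reference`."""
    return background / reference


high_fold = round(fold_excess(BACKGROUND_HIGH))
low_fold = round(fold_excess(BACKGROUND_LOW))
print(high_fold, low_fold)
```

Under these figures the background is roughly 2,600-fold above the genome average for the two high-background mutation classes and roughly 150-fold above it for the other four, which is the gap that leads the authors to a non-biological explanation.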


pone.0158340.g002: Second Round of the SSS Strategy. a. Universal forward and reverse primers are used to amplify the product made in the first round of PCR (see Fig 1E). The primers contain the Illumina Sequencing primer (ISP), which is partially complementary to the Sequencing primer region of the primers used in the first two PCR cycles (Fig 1A). The universal primers also contain the complement of the Flow cell grafting sequence (FC). b. Products of the first few cycles of the second round using the template shown in Fig 1E. c. Further amplification for 28–30 additional cycles leads to the creation of UID families, of which the one shown is representative. The original genomic mutation and the mutation that arose during the first two SSS PCR cycles are both present in this particular family. Notice that one of the family members has accumulated an additional in vitro mutation (black star) due to events during the second PCR round or the final sequencing step. d. Analysis of the UID family is able to eliminate the latter error from consideration.

Mentions: When targeting specific DNA segments using SSS, a first round of two initial PCR cycles (Fig 1) uniquely tags each starting DNA molecule (one PCR cycle is defined as a denaturation step, followed by primer annealing and finishing with polymerase extension). The primers contain a target-specific sequence, a string of randomized nucleotides called the Unique IDentifier (UID), and a short universal sequence required for Illumina sequencing. The two cycles are followed by a second round of PCR that specifically expands the target population, using primers complementary to the short universal sequence, to form the SSS library. Using a large family of different UID sequences (~10^12) in the first two cycles allows each original target genomic DNA molecule to be distinguished from the other starting target molecules. Importantly, PCR descendants of any one original genomic target molecule will share the same UID sequence (Fig 2). If only a small fraction of the final sequencing reads with the same UID has a mutation, then it is most likely due to DNA isolation, library preparation or some other NGS-related error, while a high proportion with the same mutation likely means the mutation was present in the original genomic DNA molecule [1].
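
The UID-family logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the input format (one `(uid, base)` pair per read at a single target position), and the `min_family_size` and `min_fraction` thresholds are all assumptions chosen for the example. The source says only that a "high proportion" of reads in a family should carry the mutation.

```python
from collections import Counter, defaultdict


def call_uid_families(reads, min_family_size=3, min_fraction=0.9):
    """Group reads by UID and call one base per UID family.

    reads: iterable of (uid, base) pairs for a single target position.
    Returns {uid: base or None}. None means the family was too small or
    no base reached min_fraction (both thresholds are hypothetical).
    """
    families = defaultdict(list)
    for uid, base in reads:
        families[uid].append(base)

    calls = {}
    for uid, bases in families.items():
        if len(bases) < min_family_size:
            calls[uid] = None  # too few reads to judge this family
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Accept the majority base only if it dominates the family.
        calls[uid] = base if count / len(bases) >= min_fraction else None
    return calls


def mutation_frequency(calls, reference_base):
    """Fraction of judged families whose called base differs from reference."""
    judged = [b for b in calls.values() if b is not None]
    if not judged:
        return None
    return sum(1 for b in judged if b != reference_base) / len(judged)


# Toy data with made-up UIDs; the reference base is "C".
reads = (
    [("AAC", "C")] * 10                       # clean wild-type family
    + [("GGT", "T")] * 10                     # every read mutant
    + [("TTA", "C")] * 19 + [("TTA", "T")]    # one sporadic late error
    + [("CCG", "C")]                          # family too small to judge
)
calls = call_uid_families(reads)
freq = mutation_frequency(calls, "C")
print(calls, freq)
```

In this toy input the single discordant read in family `TTA` is outvoted and the family is called as reference, while family `GGT`, in which every read carries the change, is called as mutant. Note that a uniform family does not by itself separate an original genomic mutation from one fixed during the first two PCR cycles, which is the distinction the paper addresses separately.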

