Estimating Exceptionally Rare Germline and Somatic Mutation Frequencies via Next Generation Sequencing.

Eboreime J, Choi SK, Yoon SR, Arnheim N, Calabrese P - PLoS ONE (2016)

Bottom Line: These rates far exceed the well-documented human genome average frequency per base pair (~10⁻⁸), suggesting a non-biological explanation for our data. By computational modeling and a new experimental procedure to distinguish between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments.


Affiliation: Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089-2910, United States of America.

ABSTRACT
We used targeted next generation deep-sequencing (Safe Sequencing System, SSS) to measure ultra-rare de novo mutation frequencies in the human male germline by attaching a unique identifier code to each target DNA molecule. Segments from three different human genes (FGFR3, MECP2 and PTPN11) were studied. Regardless of the gene segment, the particular testis donor or the 73 different testis pieces used, the frequencies for any one of the six different mutation types were consistent. Averaging over the C>T/G>A and G>T/C>A mutation types, the background mutation frequency was 2.6 × 10⁻⁵ per base pair, while for the four other mutation types the average background frequency was lower, at 1.5 × 10⁻⁶ per base pair. These rates far exceed the well-documented human genome average frequency per base pair (~10⁻⁸), suggesting a non-biological explanation for our data. By computational modeling and a new experimental procedure to distinguish between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. Finally, we looked at a previously studied disease mutation in the PTPN11 gene and could easily distinguish true mutations from the SSS background. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments.
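To make the scale of the discrepancy concrete, the following is a minimal arithmetic sketch in Python, not the authors' calculation. The three frequencies are taken from the abstract; the function names and the figure of one million sequenced molecules are hypothetical and chosen only for illustration.

```python
# Frequencies per base pair, as reported in the abstract.
BACKGROUND_HIGH = 2.6e-5   # averaged over C>T/G>A and G>T/C>A
BACKGROUND_LOW = 1.5e-6    # averaged over the four other mutation types
GENOME_AVERAGE = 1e-8      # approximate human genome average


def fold_excess(background, expected=GENOME_AVERAGE):
    """How many times larger the observed background is than the expected rate."""
    return background / expected


def expected_events(frequency, molecules):
    """Expected number of mutant molecules at one site among `molecules` copies."""
    return frequency * molecules


if __name__ == "__main__":
    # Hypothetical experiment size (an assumption, not a value from the paper).
    molecules = 1_000_000
    for label, freq in [("high background", BACKGROUND_HIGH),
                        ("low background", BACKGROUND_LOW),
                        ("genome average", GENOME_AVERAGE)]:
        print(label,
              "fold excess:", round(fold_excess(freq)),
              "expected events:", expected_events(freq, molecules))
```

Under these assumptions, the higher background is roughly 2,600 times the genome average and the lower background roughly 150 times, which is why a true de novo mutation at the genome-average rate would be swamped unless the background is reduced or the site has an unusually high mutation frequency.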


Fig 1. First PCR Round of the SSS strategy. a. Two primers (forward and reverse) are used for the first two SSS PCR cycles (all primers are listed in S1 Fig). Each primer (5’ to 3’) contains a Sequencing primer region (SEQ), a Unique Identifier (UID) and the forward (GENEF) or reverse (GENER) version of the gene-specific DNA sequence. Only the reverse primer contains a barcode for multiplex analysis (BC). b. Duplex genomic DNA fragment carrying a gene target that experienced an in vivo germline mutation (red circle). c. The two new daughter strands produced by the first denaturation, primer hybridization and primer extension. F42 and R28 each represent one of the many UIDs on the forward and reverse primers, respectively. The blue arrow indicates the direction of the reverse primer extension that includes the specific target region (blue dashes) and extends through the target, as does the purple forward primer on the other strand (purple dashes). The green overhang at the extreme 5’ ends will anneal to the primers used in the second PCR round. Two different extension products are created (#1 and #2). Notice that extension product #2, with the already existing in vivo mutation, is (for convenience only) also shown to experience a new in vitro mutation (black circle). In the subsequent steps, for simplicity, we follow only one of the two extension products (#2). d. The second cycle, with the same primer mix, invariably involves a primer (black) with a different UID (R12) and copies DNA strand #2 (black dashes). Subsequent single strand-specific exonuclease digestion removes all the initial PCR primers, trims the 3’ DNA overhangs from the second cycle primer extension products and removes any remaining single-stranded genomic DNA. e. Importantly, F42-R12 is tagged by different UIDs at each end and is then subjected to the second round of PCR (see Fig 2).

Mentions: When targeting specific DNA segments using SSS, a first round of two initial PCR cycles (Fig 1) uniquely tags each starting DNA molecule (one PCR cycle is defined as a denaturation step, followed by primer annealing and finishing with polymerase extension). The primers contain a target-specific sequence, a string of randomized nucleotides called the Unique IDentifier (UID), and a short universal sequence required for Illumina sequencing. The two cycles are followed by a second round of PCR that specifically expands the target population, using primers complementary to the short universal sequence, to form the SSS library. Using a large family of different UID sequences (~10¹²) in the first two cycles allows each original target genomic DNA molecule to be distinguished from the other starting target molecules. Importantly, PCR descendants of any one original genomic target molecule will share the same UID sequence (Fig 2). If only a small fraction of the final sequencing reads with the same UID has a mutation, then it is most likely due to DNA isolation, library preparation or some other NGS-related error, while a high proportion with the same mutation likely means the mutation was present in the original genomic DNA molecule [1].
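The UID logic described above can be sketched as a small Python routine. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function names, the `(uid, base)` input format, the minimum family size of 2 and the 95% agreement threshold are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def call_family(bases, reference_base, min_fraction=0.95):
    """Call one UID family at a single target position.

    A non-reference base is reported only if at least `min_fraction`
    (hypothetical threshold) of the family's reads carry that same base.
    Otherwise the family is called as the reference base, treating the
    minority variants as library-preparation or sequencing errors.
    """
    non_ref = Counter(b for b in bases if b != reference_base)
    if non_ref:
        base, count = non_ref.most_common(1)[0]
        if count / len(bases) >= min_fraction:
            return base
    return reference_base


def call_uid_families(reads, reference_base, min_family_size=2, min_fraction=0.95):
    """Group (uid, base) reads by UID and call each family.

    Families with fewer than `min_family_size` reads (hypothetical cutoff)
    are left uncalled and recorded as None.
    """
    families = defaultdict(list)
    for uid, base in reads:
        families[uid].append(base)
    calls = {}
    for uid, bases in families.items():
        if len(bases) < min_family_size:
            calls[uid] = None
        else:
            calls[uid] = call_family(bases, reference_base, min_fraction)
    return calls


def mutation_frequency(calls, reference_base):
    """Fraction of called families whose call differs from the reference."""
    called = [b for b in calls.values() if b is not None]
    if not called:
        return None
    return sum(1 for b in called if b != reference_base) / len(called)


if __name__ == "__main__":
    # Toy reads at one position with reference base "C" (made-up data).
    reads = ([("F42-R12", "T")] * 10                       # all reads mutant
             + [("F07-R33", "C")] * 9 + [("F07-R33", "T")]  # one stray error
             + [("F01-R02", "T")])                          # family too small
    calls = call_uid_families(reads, reference_base="C")
    print(calls)
    print(mutation_frequency(calls, reference_base="C"))
```

In this toy input, the family in which every read carries the variant is called mutant, the family with a single discordant read is called reference, and the single-read family is left uncalled because one read cannot separate an original mutation from a later error.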

