Lagging-strand replication shapes the mutational landscape of the genome.

Reijns MA, Kemp H, Ding J, de Procé SM, Jackson AP, Taylor MS - Nature (2015)

Bottom Line: The origin of mutations is central to understanding evolution and of key relevance to health. Here we report that the 5' ends of Okazaki fragments have significantly increased levels of nucleotide substitution, indicating a replicative origin for such mutations. Using a novel method, emRiboSeq, we map the genome-wide contribution of polymerases, and show that despite Okazaki fragment processing, DNA synthesized by error-prone polymerase-α (Pol-α) is retained in vivo, comprising approximately 1.5% of the mature genome.


Affiliation: Medical and Developmental Genetics, MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK.

ABSTRACT
The origin of mutations is central to understanding evolution and of key relevance to health. Variation occurs non-randomly across the genome, and mechanisms for this remain to be defined. Here we report that the 5' ends of Okazaki fragments have significantly increased levels of nucleotide substitution, indicating a replicative origin for such mutations. Using a novel method, emRiboSeq, we map the genome-wide contribution of polymerases, and show that despite Okazaki fragment processing, DNA synthesized by error-prone polymerase-α (Pol-α) is retained in vivo, comprising approximately 1.5% of the mature genome. We propose that DNA-binding proteins that rapidly re-associate post-replication act as partial barriers to Pol-δ-mediated displacement of Pol-α-synthesized DNA, resulting in incorporation of such Pol-α tracts and increased mutation rates at specific sites. We observe a mutational cost to chromatin and regulatory protein binding, resulting in mutation hotspots at regulatory elements, with signatures of this process detectable in both yeast and humans.



Figure 11: Elevated substitution rates are observed adjacent to many human TF binding sites. a-d, Nucleotide substitution rates (plotted as GERP scores) are elevated immediately adjacent to REST (a, b) and CTCF binding sites (c, d). Colour intensity shows quartiles of ChIP-seq peak height (pink to brown: lower to higher), reflecting strength of binding/occupancy. Stronger binding correlates with greater elevation of proximal substitution rate in the ‘shoulder’ region (*). Elevated substitution rates are not a consequence of local sequence composition effects (b, d). Strongest binding quartile of sites (brown) is shown compared to a 3-mer preserving shuffle (black) based on the flanking sequence (100 to 300 nt from motif mid-point) of the same genomic locations. 95% confidence intervals are shown as a brown dashed line and grey shading, respectively. e, Substitution rates plotted as GERP scores for human TF binding sites identified in ChIP-seq datasets (in conjunction with binding site motif). Sites aligned (x=0) on the mid-point of the TF binding site within the ChIP-seq peak (colours as for a-d). Dashed black line shows y=0, the genome-wide expectation for neutral evolution.

Mentions: As Okazaki fragment (OF) processing is conserved in eukaryotes, we next considered whether an OF-related mutational signature was also present in humans. Substitution rates are also elevated at nucleosome cores in humans [7], with an identical distribution to yeast. Furthermore, the TF NFYA has an unexplained “shoulder” of elevated substitution proximal to its binding sites [40], reminiscent of the Reb1 pattern (Fig. 1b). We therefore investigated whether similar mutational patterns are present at other experimentally defined human TF and chromatin protein binding sites. Elevated inter-species nucleotide substitution rates were detected flanking essential binding site residues for many, but not all, TFs, as well as for CTCF binding sites (Fig. 5a,b and Extended Data Fig. 6). Substitution rates were measured using GERP scores, which quantify nucleotide substitution rates relative to a genome-wide expectation of neutral evolution [41], such that a negative GERP score indicates increased nucleotide substitution rates. Furthermore, elevation in mutation rate correlated with the degree of enrichment reported in exoChIP datasets for these proteins, likely reflecting the strength of binding or frequency of occupancy at specific sites, which would be expected to influence Pol-δ processivity and consequent mutation levels.
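To make the sign convention concrete, the following is a minimal sketch, not the authors' pipeline, of how a GERP-style score and the quartile-stratified profiles in the figure could be computed. It assumes the score at a position is the expected number of substitutions under neutrality minus the observed number, so negative values mean more substitution than expected. The function names, the rank-based quartile assignment and the input layout (one list of per-position scores per site, centred on the motif mid-point) are all hypothetical choices made for illustration.

```python
from statistics import mean


def rejected_substitutions(expected, observed):
    """GERP-style score: expected minus observed substitutions.

    Negative values indicate more substitution than the neutral expectation.
    """
    return expected - observed


def quartile_labels(peak_heights):
    """Label each site 0 (weakest) to 3 (strongest) by rank of peak height."""
    n = len(peak_heights)
    order = sorted(range(n), key=lambda i: peak_heights[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = min(3, rank * 4 // n)
    return labels


def aggregate_profiles(site_scores, peak_heights):
    """Mean score at each offset from the motif mid-point, per quartile.

    site_scores: equal-length lists of per-position scores, one per site,
    all aligned so the same index corresponds to x=0 (assumed layout).
    """
    labels = quartile_labels(peak_heights)
    profiles = {}
    for q in range(4):
        rows = [s for s, lab in zip(site_scores, labels) if lab == q]
        if rows:
            # zip(*rows) walks the aligned sites column by column
            profiles[q] = [mean(col) for col in zip(*rows)]
    return profiles
```

Under these assumptions, a "shoulder" would appear as more negative mean scores at the offsets flanking x=0, most strongly in the profile for quartile 3.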

