Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages.

Schmidt D, Schwalie PC, Wilson MD, Ballester B, Gonçalves A, Kutter C, Brown GD, Marshall A, Flicek P, Odom DT - Cell (2012)

Bottom Line: To gain insight into how these DNA elements are conserved and spread through the genome, we defined the full spectrum of CTCF-binding sites, including a 33/34-mer motif, and identified over five thousand highly conserved, robust, and tissue-independent CTCF-binding locations by comparing ChIP-seq data from six mammals. We discovered fossilized repeat elements flanking deeply conserved CTCF-binding regions, indicating that similar retrotransposon expansions occurred hundreds of millions of years ago. Repeat-driven dispersal of CTCF binding is a fundamental, ancient, and still highly active mechanism of genome evolution in mammalian lineages.


Affiliation: Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK.


Figure S2: CTCF Motifs (M1 and M2) and Motif Occurrences, Related to Figure 2. (A) Motifs identified de novo from CTCF-binding events in all six species. (B) Different properties of CTCF-binding events dependent on the presence of M2. (C) Read profile at CTCF-binding events where only the M1 motif (black line) or the complete two-part motif consisting of M1 and M2 was detected (red line). (D) Binding event counts and number of binding events with at least one motif (M1 and M1+M2) in all six species. (E) Presence and absence of M1 and M2 in two DNA sequences from Filippova et al. (1996). The motif score (nmscan uses bits-suboptimal scoring with 0.0 being a perfect match) is indicated under each motif instance. (F) A multiple mammalian sequence alignment of a CTCF peak at the APP gene is shown. The DNase I footprint (red box, Quitschke et al., 2000) encompasses a complete 34 bp M1 and M2 CTCF motif.

Mentions: Our genome-wide data for CTCF binding in livers of five eutherian species allowed us to identify de novo DNA sequences associated with CTCF binding at hundreds of thousands of locations. In addition to the known 20 bp motif, we discovered a second 9 bp motif present at high frequency and with consistent spacing in each species. Both halves of the motif are unchanged across 180 million years of evolution, consistent with the high conservation of CTCF's DNA-binding domain (Figure S2), and together they create a two-part 33/34 bp binding motif, which occurs in a quarter to a third of CTCF-binding events (Figures 2A and 2B). The second motif lies either 21 or 22 bp downstream of the center of the previously identified motif, in approximately equal proportions in all studied species except mouse and rat (Figure 4). Henceforth, we will refer to the canonical 20 bp motif as M1 and to the 9 bp motif as M2. The M2 motif has previously been found in CTCF DNase footprints, but the role of this motif is unknown (Boyle et al., 2011). The variable presence of the shorter and less information-rich M2 agrees with earlier suggestions that CTCF may have multiple binding modalities (Burcin et al., 1997; Filippova et al., 1996).
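
The spacing rule described above (M2 located 21 or 22 bp downstream of the M1 center) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `pair_motifs` and `summarize` are hypothetical, the input positions are assumed to come from some motif scanner that has already been run, all coordinates are assumed to be on the same strand and oriented so that "downstream" means a larger coordinate, and the reference point used for the M2 position is an assumption, since the text does not specify it.

```python
from collections import Counter

# Offsets (in bp) from the M1 center at which M2 is reported to occur.
ALLOWED_OFFSETS = (21, 22)


def pair_motifs(m1_centers, m2_positions, offsets=ALLOWED_OFFSETS):
    """Map each M1 center to the offset of a matching M2 hit, if any.

    Hypothetical helper: an M1 hit counts as part of a two-part motif
    when an M2 hit lies exactly 21 or 22 bp downstream of its center.
    """
    m2_set = set(m2_positions)
    paired = {}
    for center in m1_centers:
        for offset in offsets:
            if center + offset in m2_set:
                paired[center] = offset
                break  # keep the first allowed offset that matches
    return paired


def summarize(m1_centers, m2_positions):
    """Count M1-only and M1+M2 sites and tabulate the offsets used."""
    paired = pair_motifs(m1_centers, m2_positions)
    total = len(m1_centers)
    return {
        "m1_only": total - len(paired),
        "m1_plus_m2": len(paired),
        "fraction_two_part": len(paired) / total if total else 0.0,
        "offset_counts": dict(Counter(paired.values())),
    }


if __name__ == "__main__":
    # Toy coordinates, invented purely for illustration.
    m1 = [100, 200, 300, 400]
    m2 = [121, 222, 350]
    print(summarize(m1, m2))
```

Tabulating `offset_counts` per species is one way to check whether the 21 bp and 22 bp spacings occur in roughly equal proportions, as the text reports for most species.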

