Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages.
Bottom Line: To gain insight into how these DNA elements are conserved and spread through the genome, we defined the full spectrum of CTCF-binding sites, including a 33/34-mer motif, and identified over five thousand highly conserved, robust, and tissue-independent CTCF-binding locations by comparing ChIP-seq data from six mammals. We discovered fossilized repeat elements flanking deeply conserved CTCF-binding regions, indicating that similar retrotransposon expansions occurred hundreds of millions of years ago. Repeat-driven dispersal of CTCF binding is a fundamental, ancient, and still highly active mechanism of genome evolution in mammalian lineages.
Affiliation: Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK.
Mentions: Our genome-wide data for CTCF binding in livers of five eutherian species allowed us to identify, de novo, the DNA sequences associated with CTCF binding at hundreds of thousands of locations. In addition to the known 20 bp motif, we discovered a second 9 bp motif present at high frequency and with consistent spacing in each species. Both halves of the motif are unchanged across 180 million years of evolution, consistent with the high conservation of CTCF's DNA-binding domain (Figure S2), and together they form a two-part 33/34 bp binding motif, which occurs in a quarter to a third of CTCF-binding events (Figures 2A and 2B). The second motif lies either 21 or 22 bp downstream of the center of the previously identified motif, in approximately equal proportions in all studied species except mouse and rat (Figure 4). Henceforth, we refer to the canonical 20 bp motif as M1 and to the 9 bp motif as M2. The M2 motif has previously been found in CTCF DNase footprints, but its role is unknown (Boyle et al., 2011). The variable presence of the shorter and less information-rich M2 agrees with earlier suggestions that CTCF may have multiple binding modalities (Burcin et al., 1997; Filippova et al., 1996).
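The spacing rule described above (M2 lying 21 or 22 bp downstream of the M1 center) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names `pair_motif_hits` and `summarize`, the input format (integer coordinates of motif hits on one strand of one sequence), and the choice to measure spacing to a single M2 reference coordinate are all assumptions made for illustration, and the coordinates are invented toy values.

```python
from collections import Counter

# Spacings (bp) between the M1 center and M2, as reported in the text.
ALLOWED_SPACINGS = (21, 22)


def pair_motif_hits(m1_centers, m2_positions, spacings=ALLOWED_SPACINGS):
    """Pair each M1 hit with an M2 hit lying downstream at an allowed spacing.

    Both inputs are iterables of integer coordinates on the same strand of
    one sequence (a hypothetical input format, assumed for this sketch).
    Returns a list of (m1_center, m2_position, spacing) tuples.
    """
    m2_set = set(m2_positions)
    pairs = []
    for m1 in sorted(m1_centers):
        for gap in spacings:
            if m1 + gap in m2_set:
                pairs.append((m1, m1 + gap, gap))
                break  # keep at most one M2 partner per M1 hit
    return pairs


def summarize(m1_centers, m2_positions):
    """Return the fraction of M1 hits with an M2 partner and the spacing counts."""
    m1_centers = list(m1_centers)
    pairs = pair_motif_hits(m1_centers, m2_positions)
    fraction = len(pairs) / len(m1_centers) if m1_centers else 0.0
    return fraction, Counter(gap for _, _, gap in pairs)


# Toy coordinates, invented for illustration only (not data from the study).
m1 = [100, 500, 900, 1300]
m2 = [121, 522, 950]
fraction, counts = summarize(m1, m2)
print(fraction, dict(counts))  # prints: 0.5 {21: 1, 22: 1}
```

On real data, the fraction would correspond to the share of M1 sites carrying the two-part motif, and the counts to the 21 bp versus 22 bp split. Handling of reverse-strand hits is left out of this sketch.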