Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages.

Schmidt D, Schwalie PC, Wilson MD, Ballester B, Gonçalves A, Kutter C, Brown GD, Marshall A, Flicek P, Odom DT - Cell (2012)

Bottom Line: To gain insight into how these DNA elements are conserved and spread through the genome, we defined the full spectrum of CTCF-binding sites, including a 33/34-mer motif, and identified over five thousand highly conserved, robust, and tissue-independent CTCF-binding locations by comparing ChIP-seq data from six mammals. We discovered fossilized repeat elements flanking deeply conserved CTCF-binding regions, indicating that similar retrotransposon expansions occurred hundreds of millions of years ago. Repeat-driven dispersal of CTCF binding is a fundamental, ancient, and still highly active mechanism of genome evolution in mammalian lineages.


Affiliation: Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Robinson Way, Cambridge CB2 0RE, UK.

Figure 4: Repeat Expansions Remodeled CTCF Binding in Three Mammalian Lineages. (A) Heatmap of 71 motif-words identified as highly enriched in mammalian lineages. (B) Lineage-specific repeats that are associated with the lineage-specific motif-words. (C) Venn diagram showing the number of B2 repeat-associated binding events shared between mouse and rat. (D) Frequencies of distances between the centers of M1 and M2 in all six studied species. There is a smaller spacing between M1 and M2 in mouse and rat (blue arrow), due to the B2 repeat expansion. (E) Sections of the aligned consensus sequences from CTCF-carrying retrotransposons in mouse, rat, dog, and opossum; rat and mouse contain the M1+M2 motif, dog and opossum only contain M1. Consensus motifs for CTCF binding solely based on bound repeat instances are shown below each alignment. (F) Estimated ages of lineage-specific repeats that expanded CTCF binding. White box plots are all instances of the indicated repeat; red box plots are only those bound by CTCF. See also Figure S4.

Mentions: Our genome-wide data for CTCF binding in livers of five eutherian species allowed us to identify de novo the DNA sequences associated with CTCF binding at hundreds of thousands of locations. In addition to the known 20 bp motif, we discovered a second, 9 bp motif present at high frequency and with consistent spacing in each species. Both halves of the motif are unchanged across 180 million years of evolution, consistent with the high conservation of CTCF's DNA-binding domain (Figure S2), and together they create a two-part 33/34 bp binding motif, which occurs in a quarter to a third of CTCF-binding events (Figures 2A and 2B). The second motif lies either 21 or 22 bp downstream of the center of the previously identified motif, in approximately equal proportions, in all studied species except mouse and rat (Figure 4). Henceforth, we will refer to the canonical 20 bp motif as M1 and to the 9 bp motif as M2. The M2 motif has previously been found in CTCF DNase footprints, but its role is unknown (Boyle et al., 2011). The variable presence of the shorter and less information-rich M2 agrees with earlier suggestions that CTCF may have multiple binding modalities (Burcin et al., 1997; Filippova et al., 1996).
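
The spacing statistic described above (and plotted in Figure 4D) can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that M1 and M2 motif hits have already been called and paired per binding site, and the function name `spacing_frequencies`, the tuple layout, and the toy coordinates are all hypothetical.

```python
from collections import Counter


def spacing_frequencies(sites):
    """Return the relative frequency of each M1-to-M2 center distance.

    `sites` is an iterable of (m1_center, m2_center, strand) tuples, an
    assumed input format. Distances are measured downstream of M1, so the
    sign is flipped for sites on the minus strand.
    """
    counts = Counter()
    for m1_center, m2_center, strand in sites:
        if strand == "+":
            offset = m2_center - m1_center
        else:
            offset = m1_center - m2_center
        counts[offset] += 1

    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {distance: n / total for distance, n in sorted(counts.items())}


# Toy data (invented for illustration): two sites per strand.
toy_sites = [
    (100, 121, "+"),
    (500, 522, "+"),
    (900, 879, "-"),
    (1300, 1278, "-"),
]
print(spacing_frequencies(toy_sites))
```

With real data, a roughly even split between the 21 bp and 22 bp bins would correspond to the "approximately equal proportions" reported in the text, while a shift toward shorter distances would correspond to the mouse and rat pattern marked in Figure 4D.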

