The majority of primate-specific regulatory sequences are derived from transposable elements.

Jacques PÉ, Jeyakani J, Bourque G - PLoS Genet. (2013)

Bottom Line: We also showed that distinct subfamilies of endogenous retroviruses (ERVs) contributed significantly more accessible regions than expected by chance, with up to 80% of their instances in open chromatin. Based on these results, we further characterized 2,150 TE subfamily-transcription factor pairs that were bound in vivo or enriched for specific binding motifs, and observed that TEs contributing to open chromatin had higher levels of sequence conservation. Taken together, these results demonstrate that TEs, and in particular ERVs, have contributed hundreds of thousands of novel regulatory elements to the primate lineage and reshaped the human transcriptional landscape.


Affiliation: Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore.

ABSTRACT
Although emerging evidence suggests that transposable elements (TEs) have contributed novel regulatory elements to the human genome, their global impact on transcriptional networks remains largely uncharacterized. Here we show that TEs have contributed nearly half of the active elements in the human genome. Using DNase I hypersensitivity data sets from ENCODE in normal, embryonic, and cancer cells, we found that 44% of open chromatin regions were in TEs and that this proportion reached 63% for primate-specific regions. We also showed that distinct subfamilies of endogenous retroviruses (ERVs) contributed significantly more accessible regions than expected by chance, with up to 80% of their instances in open chromatin. Based on these results, we further characterized 2,150 TE subfamily-transcription factor pairs that were bound in vivo or enriched for specific binding motifs, and observed that TEs contributing to open chromatin had higher levels of sequence conservation. We also showed that thousands of ERV-derived sequences were activated in a cell type-specific manner, especially in embryonic and cancer cells, and we demonstrated that this activity was associated with cell type-specific expression of neighboring genes. Taken together, these results demonstrate that TEs, and in particular ERVs, have contributed hundreds of thousands of novel regulatory elements to the primate lineage and reshaped the human transcriptional landscape.


Figure 4 (pgen-1003504-g004): Cell type-specific expression of DAR-associated genes. (A) Distribution of the expected number of up-regulated genes in proximity to the LTR2B DAR instances in GM12865. The actual number of up-regulated genes is marked with an arrowhead. (B) UCSC genome browser view of the NAPSB gene with selected RNA-Seq and DHS ENCODE tracks (y-axis maximum set to 20 and 100, respectively). The LTR2B repeat is highlighted in pink along with its cell type-specific open chromatin and expression profiles. (C) Boxplots showing the expression values across cell types for the up-regulated DAR-associated genes. Red lines connect the expression values observed in GM12865. (D) Cell type-specific DARs have more cell type-specific expression. DARs were binned according to their cell type-specific fold enrichment, and the proportion in each bin with a cell type-specific expression Z-score above 3 is shown.

Mentions: To evaluate the impact of DARs on gene regulation, we used 43 gene expression exon-array data sets from ENCODE and calculated the number of genes in proximity to DAR instances that were up-regulated in the relevant cell type relative to the others (see Materials and Methods). We identified 783 DARs with more proximal up-regulated genes than expected by chance (Table S6). For example, we identified 11 genes in proximity to LTR2B instances that were up-regulated in GM12865, whereas only 4.27 were expected by chance (Figure 4A). Examples of cell type-specific LTR2B-associated genes in GM12865 include NAPSB and CLECL1 (Figure 4B and Figures S12, S13), two genes that have been shown to play a role in lymphoblastoid cells [31], [32]. Moreover, we observed that the expression of the DAR-associated genes was frequently highest in the cell type where the DAR had been identified (Figure 4C and Figure S14). We also found that DARs with a higher cell type-specificity score had a higher chance of being associated with cell type-specific expression (Figure 4D). Similar results were obtained using ENCODE RNA-Seq data sets generated by Caltech (Figure S15).
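The observed-versus-expected comparison described above (11 up-regulated genes observed against 4.27 expected) can be illustrated with a generic resampling test. The sketch below is not the authors' procedure, which is given in their Materials and Methods and is not reproduced here. It assumes a simple null model in which random gene sets of the same size as the DAR-proximal set are drawn from all genes. The function names and the synthetic gene counts are hypothetical and chosen only so that the expected value is easy to check by hand.

```python
import random


def null_upregulated_counts(n_proximal, all_genes, upregulated,
                            n_draws=10_000, seed=0):
    """Build a null distribution of up-regulated gene counts.

    Assumption (not from the paper): the null model draws n_proximal
    genes uniformly at random, without replacement, from all_genes.
    """
    rng = random.Random(seed)
    genes = list(all_genes)
    up = set(upregulated)
    return [
        sum(g in up for g in rng.sample(genes, n_proximal))
        for _ in range(n_draws)
    ]


def empirical_p_value(observed, null_counts):
    """One-sided empirical p-value with a +1 pseudocount."""
    hits = sum(c >= observed for c in null_counts)
    return (hits + 1) / (len(null_counts) + 1)


# Synthetic illustration only; these numbers are NOT from the paper.
# 1,000 genes, 100 of them up-regulated, 40 genes near repeat instances,
# so the expected count under the null is 40 * 100 / 1000 = 4.
all_genes = [f"gene{i}" for i in range(1000)]
upregulated = all_genes[:100]
null_counts = null_upregulated_counts(40, all_genes, upregulated)
expected = sum(null_counts) / len(null_counts)
p_value = empirical_p_value(11, null_counts)
```

The histogram of `null_counts` plays the role of the distribution in Figure 4A, and the observed count plays the role of the arrowhead. A real analysis would also need to define "in proximity" and "up-regulated", which this sketch takes as given.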

