Evolutionary acquisition of promoter-associated non-coding RNA (pancRNA) repertoires diversifies species-dependent gene activation mechanisms in mammals

View Article: PubMed Central - PubMed

ABSTRACT

Background: Recent transcriptome analyses have shown that long non-coding RNAs (ncRNAs) play extensive roles in transcriptional regulation. In particular, we have reported that promoter-associated ncRNAs (pancRNAs) activate expression of their partner genes via local epigenetic changes.

Results: Here, we identify thousands of genes under pancRNA-mediated transcriptional activation in five mammalian species in common. In the mouse, 1) pancRNA-partnered genes confined their expression pattern to certain tissues compared to pancRNA-lacking genes, 2) expression of pancRNAs was significantly correlated with the enrichment of active chromatin marks, H3K4 trimethylation and H3K27 acetylation, at the promoter regions of the partner genes, 3) H3K4me1 marked the pancRNA-partnered genes regardless of their expression level, and 4) C- or G-skewed motifs were exclusively overrepresented between −200 and −1 bp relative to the transcription start sites of the pancRNA-partnered genes. More importantly, the comparative transcriptome analysis among five different mammalian species using a total of 25 counterpart tissues showed that the overall pancRNA expression profile exhibited extremely high species-specificity compared to that of total mRNA, suggesting that interspecies difference in pancRNA repertoires might lead to the diversification of mRNA expression profiles.

Conclusions: The present study raises the interesting possibility that the gain and/or loss of gene-activation-associated pancRNA repertoires, caused by formation or disruption of the genomic GC-skewed structure in the course of evolution, finely shape the tissue-specific pattern of gene expression according to a given species.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3662-1) contains supplementary material, which is available to authorized users.



Fig. 3: Genomic features of promoter sequences of pancRNA-partnered genes. a. The DNA motifs enriched in the immediately upstream regions of TSSs of pancRNA-partnered genes in the mouse genome (−200 bp to −1 bp relative to the TSS). The E-value of each motif is shown. b. Observed frequency of each DNA motif shown in panel a in the regions around TSSs (−500 bp to +500 bp relative to the TSSs) of each group of genes (total protein-coding genes, genes containing parts of other genes within their promoters, pancRNA-partnered genes, and pancRNA-lacking genes) in the five species’ genomes. c. Sequence conservation (phastCons score) for coding sequence regions (CDS), promoter regions of total genes, those of pancRNA-partnered genes, and those of pancRNA-lacking genes. *** P < 0.001; error bars indicate the first and third quartiles.

Mentions: The pancRNA-partnered genes may be distinguished not only by their epigenetic characteristics but also by enrichment of specific DNA sequences. We and another group previously reported that C-rich or G-rich sequences are preferentially located in the immediate upstream regions of the TSSs of pancRNA-partnered genes [23, 27]. In agreement with these reports, we found that CpG islands (retrieved from the UCSC Genome Browser database) were more enriched in the promoter regions of pancRNA-partnered genes than in those of either all protein-coding genes or pancRNA-lacking genes in the five species (Additional file 7: Table S3). We also identified C- and G-skewed motifs, which showed biased enrichment of cytosines and guanines, respectively, in the immediate upstream regions of TSSs (−200 to −1 bp) of pancRNA-partnered genes in the genomes of all five species examined here (Fig. 3a, Additional file 8: Figure S5). Analysis of the distribution of these motifs around TSSs confirmed that the C- and G-skewed motifs were observed more frequently in the immediate upstream regions of TSSs of pancRNA-partnered genes than in those of pancRNA-lacking genes in all five species (Fig. 3b). Of the immediate upstream regions of TSSs of pancRNA-partnered genes that bore C- and/or G-skewed motifs, about 16.4% harbored both motifs in all five species (Additional file 9: Table S4). Thus, the presence of either C- or G-skewed motifs in the immediate upstream regions of TSSs is a genomic feature of pancRNA-partnered genes.
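
To make the notion of a C- or G-skewed upstream region concrete, the following is a minimal Python sketch that extracts the −200 to −1 bp window relative to a TSS and scores its strand bias as (G − C) / (G + C). It is not the motif-discovery procedure used in the study; the function names, the 0-based TSS coordinate convention, and the 0.2 classification threshold are all illustrative assumptions.

```python
# Illustrative sketch only: a simple skew score, not the authors' motif analysis.
# All names and the default threshold below are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gc_skew(seq: str) -> float:
    """Return (G - C) / (G + C); 0.0 if the sequence has no G or C."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def upstream_window(chrom_seq: str, tss: int, strand: str = "+", length: int = 200) -> str:
    """Return the -length..-1 bp window for a 0-based TSS index,
    oriented in the direction of transcription."""
    if strand == "+":
        return chrom_seq[max(0, tss - length):tss]
    # Minus strand: upstream lies at higher coordinates, so take the
    # following bases and reverse-complement them.
    region = chrom_seq[tss + 1:tss + 1 + length]
    return region[::-1].translate(_COMPLEMENT)


def classify(seq: str, threshold: float = 0.2) -> str:
    """Label a window as G-skewed, C-skewed, or unskewed (threshold is an assumption)."""
    skew = gc_skew(seq)
    if skew >= threshold:
        return "G-skewed"
    if skew <= -threshold:
        return "C-skewed"
    return "unskewed"


if __name__ == "__main__":
    toy_chrom = "AAACCCGTT"  # toy sequence, not real genomic data
    window = upstream_window(toy_chrom, tss=6, strand="+", length=3)
    print(window, gc_skew(window), classify(window))
```

Because the window is returned in transcript orientation, a region that reads as C-rich on the forward strand is scored as G-rich for a minus-strand gene, which keeps the C- versus G-skew labels comparable across strands.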

