Differential and coherent processing patterns from small RNAs.

Pundhir S, Gorodkin J - Sci Rep (2015)

Bottom Line: While the annotated loci predominantly consist of ~24 nt short RNAs, the unannotated loci comparatively consist of ~17 nt short RNAs. Furthermore, these ~17 nt short RNAs are significantly enriched for overlap to transcription start sites and DNase I hypersensitive sites (p-value < 0.01) that are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study has the potential to be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.


Affiliation: Center for non-coding RNA in Technology and Health, IKVH, University of Copenhagen, Grønnegårdsvej 3, 1870, Frederiksberg C, Denmark.

ABSTRACT
Post-transcriptional processing events related to short RNAs are often reflected in the read profile patterns that emerge from high-throughput sequencing data. MicroRNA arm switching across different tissues is a well-known example of what we define as differential processing. Here, short RNAs from the nine cell lines of the ENCODE project, irrespective of their annotation status, were analyzed for genomic loci representing differential or coherent processing. We observed differential processing predominantly in RNAs annotated as miRNA, snoRNA or tRNA. Four out of five known cases of differentially processed miRNAs that were in the input dataset were recovered, and several novel cases were discovered. In contrast to differential processing, coherent processing was observed to be widespread in both annotated and unannotated regions. While the annotated loci predominantly consist of ~24 nt short RNAs, the unannotated loci comparatively consist of ~17 nt short RNAs. Furthermore, these ~17 nt short RNAs are significantly enriched for overlap with transcription start sites and DNase I hypersensitive sites (p-value < 0.01), which are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study has the potential to be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.


Figure 5: Representative examples of coherently processed loci (CPL). UCSC genome browser view showing the genomic location of four CPL (A–D). All the CPL (marked in red) are located at active promoter or enhancer regions (highlighted in green), as supported by an enrichment of H3K4me3 or H3K4me1 histone modification (signal height), an enrichment of DNase I hypersensitivity clusters (black bars) and the presence of a POL2 binding site (black bars). Also shown are the read profiles corresponding to the four CPL (inset), organized in nine rows (cell lines) and two columns (replicates). As is evident, the read profiles are similar across both replicates of the nine cell lines.

Mentions: Next, we analyzed the enrichment of 195 CPL for short and long ncRNAs generated due to the bidirectional nature of transcription initiation at active TSS and enhancer regions [53-58]. These ncRNAs include small (<22 nt) ncRNAs, known interchangeably as transcription initiation RNAs (tiRNAs) or transcription start site-associated RNAs (TSSa-RNAs) [53, 54, 57], and long ncRNAs, such as promoter-upstream transcripts (PROMPTs) or long non-coding RNAs (lncRNAs) [57, 58], generated from the bidirectional TSS. The tiRNAs or TSSa-RNAs, in particular, are derived from nascent RNAs protected against nucleolysis by stalled RNAPII [57]. To determine the possible association of the 195 unannotated CPL with tiRNAs, we compared the length of reads from these CPL with the length of reads mapped to 158 annotated CPL and observed two completely distinct distributions. While most of the reads from annotated CPL were >22 nt with a modal length of 24 nt, most reads from unannotated CPL were <22 nt with a modal length of 17 nt (Fig. 4B), which agrees with tiRNAs in terms of read length. Next, we analyzed the location of the unannotated CPL with respect to the TSS and observed 41 CPL located within a window of 1000 nt upstream and downstream of the TSS (Fig. 4C). The TSS were determined using the 5′ end of the gene annotations available from the GENCODE project [59]. Specifically, we divided the 2000 nt window around the TSS into 100 equally sized bins of 20 nt each and computed the percentage overlap of a CPL with each bin. The percentage overlap of each bin was then averaged over all 41 CPL. In agreement with Taft et al. [54], we observed CPL in the sense direction peaking at 50 nt downstream of the TSS. Furthermore, similar to anti-sense tiRNAs reflecting bidirectional promoters, many CPL were also located upstream of the TSS in the anti-sense direction. However, unlike tiRNAs, we also observed CPL located up to 650 nt upstream of the TSS in the sense direction and 100 nt downstream of the TSS in the anti-sense direction (Fig. 4C). Figure 5 shows four representative examples of CPL located at or in proximity to TSS or enhancer regions. The TSS or enhancer regions are supported by an enrichment of H3K4me3 or H3K4me1 histone modification, an enrichment of DNase I hypersensitivity clusters and the presence of a POL2 binding site.
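
The binning step described above (a 2000 nt window around the TSS, split into 100 bins of 20 nt, with per-bin percentage overlap averaged over loci) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the half-open coordinate convention, and the way strand is handled are all assumptions made for illustration, and the sketch does not separate sense from anti-sense loci as Fig. 4C does.

```python
from typing import List

WINDOW = 1000                      # nt on each side of the TSS (from the text)
BIN_SIZE = 20                      # nt per bin (from the text)
N_BINS = 2 * WINDOW // BIN_SIZE    # 100 bins


def bin_overlap_profile(start: int, end: int, tss: int, strand: str = "+") -> List[float]:
    """Percentage of each bin covered by one locus.

    Assumption: the locus is a half-open interval [start, end) in genomic
    coordinates, and `strand` is the strand of the gene that defines the TSS.
    """
    # Convert to TSS-relative coordinates; negative values are upstream.
    if strand == "+":
        rel_start, rel_end = start - tss, end - tss
    else:
        # On the minus strand, downstream means decreasing genomic position.
        rel_start, rel_end = tss - end + 1, tss - start + 1

    profile = []
    for i in range(N_BINS):
        bin_start = -WINDOW + i * BIN_SIZE
        bin_end = bin_start + BIN_SIZE
        covered = max(0, min(rel_end, bin_end) - max(rel_start, bin_start))
        profile.append(100.0 * covered / BIN_SIZE)
    return profile


def average_profile(profiles: List[List[float]]) -> List[float]:
    """Average the per-bin percentages over all loci (hypothetical helper)."""
    n = len(profiles)
    return [sum(p[i] for p in profiles) / n for i in range(N_BINS)]


if __name__ == "__main__":
    # Two made-up loci near a made-up TSS at position 5000 on the plus strand.
    loci = [(5040, 5060), (5050, 5070)]
    mean = average_profile([bin_overlap_profile(s, e, 5000) for s, e in loci])
    peak_bin = max(range(N_BINS), key=lambda i: mean[i])
    print("peak bin starts at", -WINDOW + peak_bin * BIN_SIZE, "nt relative to the TSS")
```

In this sketch a locus contributes to a bin in proportion to how much of the bin it covers, so a 20 nt locus straddling two bins is split between them rather than assigned to one.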

