Differential and coherent processing patterns from small RNAs.

Pundhir S, Gorodkin J - Sci Rep (2015)

Bottom Line: While the annotated loci predominantly consist of ~24 nt short RNAs, the unannotated loci by comparison consist mostly of ~17 nt short RNAs. Furthermore, these ~17 nt short RNAs are significantly enriched for overlap with transcription start sites and DNase I hypersensitive sites (p-value < 0.01), which are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study has the potential to be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.


Affiliation: Center for non-coding RNA in Technology and Health, IKVH, University of Copenhagen, Grønnegårdsvej 3, 1870, Frederiksberg C, Denmark.

ABSTRACT
Post-transcriptional processing events related to short RNAs are often reflected in the read profile patterns that emerge from high-throughput sequencing data. MicroRNA arm switching across different tissues is a well-known example of what we define as differential processing. Here, short RNAs from the nine cell lines of the ENCODE project, irrespective of their annotation status, were analyzed for genomic loci representing differential or coherent processing. We observed differential processing predominantly in RNAs annotated as miRNA, snoRNA or tRNA. Four out of five known cases of differentially processed miRNAs that were in the input dataset were recovered, and several novel cases were discovered. In contrast to differential processing, coherent processing is widespread in both annotated and unannotated regions. While the annotated loci predominantly consist of ~24 nt short RNAs, the unannotated loci by comparison consist mostly of ~17 nt short RNAs. Furthermore, these ~17 nt short RNAs are significantly enriched for overlap with transcription start sites and DNase I hypersensitive sites (p-value < 0.01), which are characteristic features of transcription initiation RNAs. We discuss how the computational pipeline developed in this study has the potential to be applied to other forms of RNA-seq data for further transcriptome-wide studies of differential and coherent processing.


Figure 1: Examples of differentially processed loci (DPL) obtained after the analysis of short total RNA-seq data from nine human cell lines (two biological replicates each). Each panel shows the read profiles (in black) across nine cell lines (both replicates), along with the normalized expression of constituent read blocks (Equation 4). Tree branches corresponding to the distinct clusters of cell lines (p-value < 0.05) are marked with different colors. In all four examples, we observe two distinct sets of read profiles, characterized by high expression from one end of the read profile in a subset of cell lines and from the other end in the rest of the cell lines. (A,B) MiRNA read profiles exhibit a typical example of a process called ‘arm switching’, where the pre-miRNA switches the arm from which the mature miRNA is processed [11]. Similar read profiles are observed in both biological replicates despite much variability in their expression. (C,D) Transposition in the expression from the two ends is also observed for tRNA and snoRNA loci. These and rRNAs have also been shown to produce small 5′ and 3′ end fragments in an asymmetric manner that predominantly favors either the 5′ or 3′ end [34]. In the snoRNA profile, due to high variation between the read profiles from the two biological replicates of skin, we observed inconsistent cluster scores of 0.19 and 0.14 in replicates 1 and 2, respectively. The nine cell lines from ENCODE are abbreviated with the initials of the parent human tissues (Bl: blood, GM12878; Br: brain, SK-N-SH RA; Bt: breast, MCF-7; Ce: cervix, HeLa-S3; Ep: epithelium, A549; Es: embryonic stem cell, H1-hESC; Li: liver, HEPG2; Lu: lung, AG04450; Sk: skin, BJ).

Mentions: Out of 701 loci analyzed, differential processing was observed for 97 loci (DPL). Although about one-third of the 701 loci were unannotated (216 out of 701, 31%), only a marginal proportion of them (three out of 216, 1%) were differentially processed (Supplementary Table S2). In contrast, almost all DPL (94 out of 97) were annotated as non-coding RNAs (22 miRNAs, 31 snoRNAs, 30 tRNAs and 11 other ncRNAs). Figure 1 illustrates four examples of DPL observed in the ENCODE dataset. The first two examples are miRNAs, namely hsa-mir-30a and hsa-mir-30e, which have distinct expression profiles between the two arms of the pre-miRNA (5′ and 3′ end) across cell lines. While the 5′ end is more highly expressed in the majority of cell lines, expression at the 3′ end is more pronounced in HEPG2 (liver), H1-hESC (embryonic stem cell) and BJ (skin) for hsa-mir-30a. This is a known example of ‘arm switching’, where the pre-miRNA switches the arm from which the mature miRNA is processed [11, 28]. Similar arm switching is observed for hsa-mir-30e, which has also been reported previously using a PCR-based method for miRNA quantification [28]. Intriguingly, hsa-mir-30a and hsa-mir-30e, which belong to the same miRNA family (mir-30), are differentially processed between a similar set of cell lines (HEPG2, H1-hESC and BJ), and both have been implicated in the epithelial-mesenchymal transition (EMT) in human pancreatic cells [29]. In total, we identified 22 differentially processed miRNA loci, of which nine are clear cases of ‘arm switching’ (Supplementary Table S3) and the remaining 13 exhibit ‘arm loss’. Note that all these cases fulfill the requirement of expression in both replicates of all nine cell lines. Five of the nine clear cases of ‘arm switching’ have previously been reported, three in human and two in mouse (Supplementary Table S3).
From the opposite perspective, a total of 39 cases of ‘arm switching’ have been reported in human, of which five fulfilled the expression criteria and are thus in our input set. Four of them are recovered, three as arm switching and one as arm loss (Supplementary Table S4). In addition, none of the remaining 12 cases of arm loss corresponds to any of the 33 previously known cases of arm switching in mouse [30, 31].
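
The idea of arm switching described above (the dominant arm of a pre-miRNA differing between cell lines) can be illustrated with a small sketch. This is not the pipeline used in the study, which clusters read profiles; it is a simplified ratio rule assumed here for illustration only. The function names, the `min_ratio` threshold and the read counts are all hypothetical.

```python
from typing import Dict, Tuple


def dominant_arm(five_p: float, three_p: float, min_ratio: float = 2.0) -> str:
    """Label one cell line by the arm with clearly higher expression.

    Hypothetical rule: an arm is dominant if its expression is at least
    `min_ratio` times that of the other arm; otherwise 'ambiguous'.
    """
    if five_p > 0 and five_p >= min_ratio * three_p:
        return "5p"
    if three_p > 0 and three_p >= min_ratio * five_p:
        return "3p"
    return "ambiguous"


def classify_locus(counts: Dict[str, Tuple[float, float]],
                   min_ratio: float = 2.0) -> str:
    """Report arm switching if both arms are dominant in different cell lines."""
    arms = {dominant_arm(f, t, min_ratio) for f, t in counts.values()}
    if {"5p", "3p"} <= arms:
        return "arm switching"
    return "no arm switching detected"


# Made-up (5' arm, 3' arm) expression values, keyed by the cell line
# abbreviations from the figure legend; these are NOT data from the study.
example_counts = {
    "Bl": (120.0, 30.0),
    "Br": (90.0, 20.0),
    "Li": (15.0, 80.0),
    "Es": (10.0, 60.0),
    "Sk": (25.0, 70.0),
}

if __name__ == "__main__":
    print(classify_locus(example_counts))
```

A fixed fold-change threshold is the simplest possible choice; it ignores replicate variability, which the clustering-based approach described in the text is designed to handle.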

