Analysis of Nearly One Thousand Mammalian Mirtrons Reveals Novel Features of Dicer Substrates.

Wen J, Ladewig E, Shenker S, Mohammed J, Lai EC - PLoS Comput. Biol. (2015)

Bottom Line: While mirtrons generate miRNA-class regulatory RNAs, we also find that mirtrons exhibit many features that distinguish them from canonical miRNAs. We observe that conventional mirtron hairpins are substantially longer than Drosha-generated pre-miRNAs, indicating that the characteristic length of canonical pre-miRNAs is not a general feature of Dicer substrate hairpins. In addition, mammalian mirtrons exhibit unique patterns of ordered 5' and 3' heterogeneity, which reveal hidden complexity in miRNA processing pathways.


Affiliation: Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, United States of America.

ABSTRACT
Mirtrons are microRNA (miRNA) substrates that utilize the splicing machinery to bypass the necessity of Drosha cleavage for their biogenesis. Expanding our recent efforts for mammalian mirtron annotation, we use meta-analysis of aggregate datasets to identify ~500 novel mouse and human introns that confidently generate diced small RNA duplexes. These comprise nearly 1000 total loci distributed in four splicing-mediated biogenesis subclasses, with 5'-tailed mirtrons as, by far, the dominant subtype. Thus, mirtrons surprisingly comprise a substantial fraction of endogenous Dicer substrates in mammalian genomes. Although mirtron-derived small RNAs exhibit overall expression correlation with their host mRNAs, we observe a subset with substantial differences that suggest regulated processing or accumulation. We identify characteristic sequence, length, and structural features of mirtron loci that distinguish them from bulk introns, and find that mirtrons preferentially emerge from genes with larger numbers of introns. While mirtrons generate miRNA-class regulatory RNAs, we also find that mirtrons exhibit many features that distinguish them from canonical miRNAs. We observe that conventional mirtron hairpins are substantially longer than Drosha-generated pre-miRNAs, indicating that the characteristic length of canonical pre-miRNAs is not a general feature of Dicer substrate hairpins. In addition, mammalian mirtrons exhibit unique patterns of ordered 5' and 3' heterogeneity, which reveal hidden complexity in miRNA processing pathways. These include broad 3'-uridylation of mirtron hairpins, atypically heterogeneous 5' termini that may result from exonucleolytic processing, and occasionally robust decapitation of the 5' guanine (G) of mirtron-5p species defined by splicing. 
Altogether, this study reveals that this extensive class of non-canonical miRNA bears a multitude of characteristic properties, many of which raise general mechanistic questions regarding the processing of endogenous hairpin transcripts.


Fig 3. Correlation of mirtron and host gene expression. (A) We calculated the Pearson correlation coefficients of the accumulation of mouse mirtron-derived small RNAs and spliced RNA-seq reads directly flanking the mirtron across seven tissues. We also performed 100 control comparisons where the tissue origins were shuffled. The cumulative distribution function (CDF) of these correlations was plotted, and observed to be significantly positively correlated (by Mann-Whitney U-test). (B) The binned distribution of mirtron/mRNA Pearson correlation coefficients was plotted. This visualization emphasizes their positive correlation, but also highlights a subset of discordant loci. (C) Examples of correlated and discordant expression of mirtron-derived miRNAs and host mRNAs across tissues. We show host level gene expression as reads per kilobase of transcript per million mapped reads (RPKM) and the spliced exonic reads that directly cross the mirtronic locus as reads per million mapped reads (RPM). Mirtron-derived miRNAs are quantified as reads per million mapped miRNA reads (RPMM).
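The three expression units named in the caption differ only in what they normalize by. The following is a minimal sketch of those definitions as they are spelled out in the caption; the function names, argument names, and example counts are hypothetical and are not taken from the paper's pipeline.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions_mapped = total_mapped_reads / 1_000_000
    return read_count / kilobases / millions_mapped


def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads (no length normalization)."""
    return read_count / (total_mapped_reads / 1_000_000)


def rpmm(mirna_read_count, total_mapped_mirna_reads):
    """Reads per million mapped miRNA reads."""
    return mirna_read_count / (total_mapped_mirna_reads / 1_000_000)


# Hypothetical counts for a single locus in a single tissue.
host_gene_rpkm = rpkm(read_count=500, transcript_length_bp=2_000,
                      total_mapped_reads=10_000_000)
junction_rpm = rpm(read_count=50, total_mapped_reads=10_000_000)
mirtron_rpmm = rpmm(mirna_read_count=120, total_mapped_miRNA_reads=2_000_000) \
    if False else rpmm(120, 2_000_000)
```

Because RPM and RPMM use different denominators (all mapped reads versus mapped miRNA reads only), their absolute values are not directly comparable; a correlation across tissues depends only on how each quantity varies, not on its scale.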


Mentions: The above analysis, which uses the mRNA expression of all exons, might underestimate the true correlation if mirtrons were generated from specific mRNA isoforms. We attempted to remedy this by performing correlation analysis only with spliced mRNA reads that span exon-exon junctions across mirtrons. This analysis generated a positive correlation in the human data, but it was less significant than with the gene-level analysis. This is plausibly explained by undersampling, since we observed some human loci with no spliced RNA-seq reads across a given mirtron locus (S5 Fig). The available mouse RNA-seq data (~3.8 billion reads from 7 tissues; ~550 million reads/tissue), however, were much deeper than the human data (370 million reads from 6 tissues; ~61 million reads/tissue). Indeed, the superior depth of the mouse data for spliced reads across mirtrons substantially improved the correlation of host gene-mirtron expression. As shown in the CDF plot (Fig 3A), there is a strong bias for well-correlated pairs (p<9.97E-12).
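The per-locus comparison described above can be illustrated with a short sketch: a Pearson correlation between mirtron small-RNA levels and junction-spanning mRNA reads across tissues, plus 100 controls in which tissue labels are shuffled. This is only an illustration under stated assumptions. The expression values are invented, the helper names are hypothetical, and shuffling is assumed to mean permuting the tissue order of one vector relative to the other; the source does not specify the exact procedure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / math.sqrt(var_x * var_y)


def shuffled_controls(mirtron, host, n_shuffles=100, seed=0):
    """Correlations after permuting the tissue order of the host vector.

    Assumption: this is one plausible reading of "tissue origins were
    shuffled"; the paper may have implemented the control differently.
    """
    rng = random.Random(seed)
    controls = []
    for _ in range(n_shuffles):
        permuted = list(host)
        rng.shuffle(permuted)
        controls.append(pearson(mirtron, permuted))
    return controls


# Invented values for one locus across seven tissues (not from the paper).
mirtron_rpmm = [12.0, 30.0, 5.0, 44.0, 18.0, 2.0, 25.0]
junction_rpm = [1.1, 2.9, 0.4, 4.6, 1.7, 0.3, 2.2]

observed = pearson(mirtron_rpmm, junction_rpm)
controls = shuffled_controls(mirtron_rpmm, junction_rpm)
```

In the analysis described in the text, one such coefficient is obtained per mirtron locus, and the distribution of observed coefficients is compared with the shuffled distribution. A rank-based two-sample test such as `scipy.stats.mannwhitneyu` could be applied to the two lists of coefficients for that comparison.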

