Computational prediction of the localization of microRNAs within their pre-miRNA.

Leclercq M, Diallo AB, Blanchette M - Nucleic Acids Res. (2013)

Bottom Line: Because we observed that miRNAs have sequence and structural properties that differ between species, mostly in terms of duplex stability, we trained various clade-specific miRdup models and obtained increased accuracy. MiRdup self-trains on the most recent version of miRbase and is easy to use. Combined with existing pre-miRNA predictors, it will be valuable for both de novo mapping of miRNAs and filtering of large sets of candidate miRNAs obtained from transcriptome sequencing projects.


Affiliation: School of Computer Science and McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada H3A2B2 and Laboratoire de bioinformatique du département informatique, Université du Québec À Montréal, Montreal, Quebec, Canada H2X3Y7.

ABSTRACT
MicroRNAs (miRNAs) are short RNA species derived from hairpin-forming miRNA precursors (pre-miRNA) and acting as key posttranscriptional regulators. Most computational tools labeled as miRNA predictors are in fact pre-miRNA predictors and provide no information about the putative miRNA location within the pre-miRNA. Sequence and structural features that determine the location of the miRNA, and the extent to which these properties vary from species to species, are poorly understood. We have developed miRdup, a computational predictor for the identification of the most likely miRNA location within a given pre-miRNA or the validation of a candidate miRNA. MiRdup is based on a random forest classifier trained with experimentally validated miRNAs from miRbase, with features that characterize the miRNA-miRNA* duplex. Because we observed that miRNAs have sequence and structural properties that differ between species, mostly in terms of duplex stability, we trained various clade-specific miRdup models and obtained increased accuracy. MiRdup self-trains on the most recent version of miRbase and is easy to use. Combined with existing pre-miRNA predictors, it will be valuable for both de novo mapping of miRNAs and filtering of large sets of candidate miRNAs obtained from transcriptome sequencing projects. MiRdup is open source under the GPLv3 and available at http://www.cs.mcgill.ca/~blanchem/mirdup/.

Properties of miRNAs from six different lineages: all eukaryotes (19 823 miRNAs), mammals (6959), fish (766), nematodes (1087), arthropods (2620) and plants (4732). Each panel shows the distribution of a selected feature. (A) MiRNA length (nt). (B) MFE of the miRNA–miRNA* duplex (kcal/mol). (C) Length of the largest bulge in the miRNA (nt). (D) Number of bulges in the miRNA–miRNA* duplex. (E) Length of longest bulge-free stem in the miRNA–miRNA* duplex. (F) Start position of the first 10 nt bulge-free stem in the miRNA–miRNA* duplex; −1 means no such region is present. (G) Distance to the terminal loop of the hairpin (nt). (H) miRNA GC-content. (I) Nucleotide type (A, U, G or C) at the first position of the miRNA.
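
Several of the features listed in the caption can be computed directly from a pre-miRNA sequence and its dot-bracket secondary structure. The following is a minimal sketch, not miRdup's own code: the function name `mirna_features`, the dictionary keys, the toy hairpin, and the exact definitions used here (the longest run of unpaired positions as a stand-in for the largest bulge, and the number of nucleotides separating the miRNA from the terminal loop of a simple single-loop hairpin) are all assumptions made for illustration.

```python
import re


def mirna_features(pre_mirna: str, structure: str, start: int, length: int) -> dict:
    """Compute a few simple features of a candidate miRNA.

    pre_mirna : pre-miRNA sequence (A, U, G, C)
    structure : dot-bracket structure of the same length (assumed simple hairpin)
    start     : 0-based start of the candidate miRNA within the pre-miRNA
    length    : candidate miRNA length in nucleotides
    All definitions below are illustrative assumptions, not miRdup's.
    """
    if len(pre_mirna) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    end = start + length  # exclusive end index
    if start < 0 or length <= 0 or end > len(pre_mirna):
        raise ValueError("candidate miRNA lies outside the pre-miRNA")

    seq = pre_mirna[start:end].upper()
    struct = structure[start:end]

    # Fraction of G and C nucleotides in the candidate miRNA.
    gc_content = sum(nt in "GC" for nt in seq) / length

    # Longest run of unpaired positions inside the miRNA (proxy for largest bulge).
    runs = re.findall(r"\.+", struct)
    largest_unpaired_run = max((len(r) for r in runs), default=0)

    # Terminal loop of a simple hairpin: between the last '(' and the first ')'.
    last_open = structure.rfind("(")
    first_close = structure.find(")")
    if end - 1 <= last_open:        # candidate on the 5' arm
        distance_to_loop = last_open - (end - 1)
    elif start >= first_close:      # candidate on the 3' arm
        distance_to_loop = start - first_close
    else:                           # candidate overlaps the loop
        distance_to_loop = 0

    return {
        "length": length,
        "gc_content": gc_content,
        "first_nt": seq[0],
        "largest_unpaired_run": largest_unpaired_run,
        "distance_to_loop": distance_to_loop,
    }


# Toy 20-nt hairpin (hypothetical, far shorter than a real pre-miRNA).
toy_seq = "GGCAAUGCAGAAAUGCAGCC"
toy_struct = "(((..((((....)))))))"
features = mirna_features(toy_seq, toy_struct, start=0, length=8)
print(features)
```

Feature vectors of this kind, computed for known miRNAs and for other positions in the same hairpins, are the sort of input a random forest classifier could be trained on; the actual miRdup feature set and training procedure are described in the article itself.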