Prediction of MicroRNA Precursors Using Parsimonious Feature Sets.

Stepanowsky P, Levy E, Kim J, Jiang X, Ohno-Machado L - Cancer Inform (2014)

Bottom Line: However, no study has systematically compared published feature sets. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer.


Affiliation: Bioinformatics Research Group, University of Applied Sciences, Upper Austria, Hagenberg, Austria.

ABSTRACT
MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance of different models to predict novel miRNAs varies with the features chosen as predictors. However, no study has systematically compared published feature sets. We constructed a comprehensive feature set using the minimum free energy of the secondary structure of precursor miRNAs, a set of nucleotide-structure triplets, and additional extracted sequence and structure characteristics. We then compared the predictive value of our comprehensive feature set to those from three previously published studies, using logistic regression and random forest classifiers. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer.


Figure 1: The secondary structure of a pre-miRNA can change if a single nucleotide differs. The figure shows (A) the pre-miRNA hsa-miR-19b-1 sequence and its secondary structure, (B) a one-nucleotide change (yellow) that modifies the loop, and (C) a one-nucleotide change that modifies the stem arm.
© Copyright Policy - open-access

Mentions: The secondary structures of the pre-miRNAs were predicted using the Vienna RNAfold package [23,27]. The structures are represented in bracket notation, which allows only two states for a nucleotide: paired and unpaired. Opening and closing parentheses, “(” and “)”, mark the two partners of a nucleotide pair: the nucleotide on the 5’ end and the nucleotide on the 3’ end, respectively. Dots represent unpaired nucleotides. We did not distinguish between paired nucleotides on the 5’ and 3’ ends in this study, so we used “(” for both cases. A typical secondary structure of a pre-miRNA contains a stem and a single loop, as displayed in Figure 1A. Pre-miRNAs containing multiple loops in their secondary structures were not considered. The bracket notation gives the number of paired and unpaired nucleotides, as well as the ratio between them. The number of nucleotides on the 5’ and 3’ stem arms can differ because of bulges caused by unpaired nucleotides. For this reason, the number of nucleotide pairs was normalized by the length of the longer stem arm. The loop length was normalized by the length of the pre-miRNA hairpin.
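
The bracket-notation features described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `bracket_features`, the dictionary keys, and the exact definitions of "stem arm length" (first to last paired position on each side, bulges included) and "hairpin length" (the full length of the input string) are assumptions made for illustration. The input is assumed to be a dot-bracket string such as the one RNAfold produces.

```python
def bracket_features(structure: str) -> dict:
    """Summarize a single-hairpin dot-bracket string.

    Hypothetical helper for illustration; definitions of arm and
    hairpin length are assumptions, not taken from the paper.
    """
    if set(structure) - set("()."):
        raise ValueError("structure may contain only '(', ')' and '.'")

    n_pairs = structure.count("(")
    if n_pairs == 0 or n_pairs != structure.count(")"):
        raise ValueError("structure has no pairs or is unbalanced")

    first_open = structure.find("(")
    last_open = structure.rfind("(")
    first_close = structure.find(")")
    last_close = structure.rfind(")")

    # A single stem-loop has every '(' before every ')'.
    # Anything else contains more than one loop and is rejected.
    if first_close < last_open:
        raise ValueError("multiple loops are not supported")

    # Arm lengths include unpaired bulge nucleotides inside the arm.
    arm5_len = last_open - first_open + 1
    arm3_len = last_close - first_close + 1
    loop_len = first_close - last_open - 1
    n_unpaired = structure.count(".")

    return {
        # 5' and 3' partners are not distinguished: ')' becomes '('.
        "collapsed": structure.replace(")", "("),
        "n_pairs": n_pairs,
        "n_paired": 2 * n_pairs,
        "n_unpaired": n_unpaired,
        "paired_fraction": 2 * n_pairs / len(structure),
        "arm5_len": arm5_len,
        "arm3_len": arm3_len,
        # Pairs normalized by the longer stem arm.
        "pairs_norm": n_pairs / max(arm5_len, arm3_len),
        # Loop normalized by the hairpin length.
        "loop_norm": loop_len / len(structure),
    }


if __name__ == "__main__":
    # Toy structure with one bulge on the 5' arm (not a real pre-miRNA).
    for key, value in bracket_features("((((.((....))))))..").items():
        print(f"{key}: {value}")
```

In the toy structure the 5’ arm contains one bulge, so it is one nucleotide longer than the 3’ arm, and the pair count is divided by the 5’ arm length. Structures with more than one loop raise a `ValueError`, mirroring the exclusion of multi-loop pre-miRNAs.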

