Ensemble Methods for MiRNA Target Prediction from Expression Data.

Le TD, Zhang J, Liu L, Li J - PLoS ONE (2015)

Bottom Line: Ensemble methods integrate the results from individual methods and have been proved, in theory, to outperform each of their individual component methods. Validation against experimentally confirmed databases shows that the results of the ensemble methods complement those obtained by the individual methods, and that the ensemble methods perform better than the individual methods across different datasets. Further analysis of the best ensemble method (Pearson+IDA+Lasso) shows that it can obtain targets that none of the single methods could find, and that the discovered targets are more statistically significant and functionally enriched.


Affiliation: School of Information Technology and Mathematical Sciences, University of South Australia, Adelaide, South Australia, Australia.

ABSTRACT

Background: microRNAs (miRNAs) are short regulatory RNAs that are involved in several diseases, including cancers. Identifying miRNA functions is very important for understanding disease mechanisms and determining the efficacy of drugs. An increasing number of computational methods have been developed to explore miRNA functions by inferring miRNA-mRNA regulatory relationships from data. Each method rests on its own assumptions and constraints, for instance that the relationships between variables are linear. As a result, computational methods often perform inconsistently across different datasets. Ensemble methods, by contrast, integrate the results from individual methods and have been proved, in theory, to outperform each of their individual component methods.

Results: In this paper, we investigate how ensemble methods perform relative to commonly used miRNA target prediction methods. We apply eight popular miRNA target prediction methods to three cancer datasets and compare their performance with that of ensemble methods that integrate the results from each combination of the individual methods. Validation against experimentally confirmed databases shows that the results of the ensemble methods complement those obtained by the individual methods, and that the ensemble methods perform better than the individual methods across different datasets. The ensemble Pearson+IDA+Lasso, which combines methods from three different approaches (a correlation method, a causal inference method, and a regression method), is the best-performing ensemble method in this study. Further analysis of this ensemble method shows that it can obtain targets that none of the single methods could find, and that the discovered targets are more statistically significant and functionally enriched. The source code, datasets, miRNA target predictions by all methods, and the ground truth for validation are available in the Supplementary materials.
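The abstract does not say how the individual results are integrated. As a minimal sketch, the following assumes a Borda-style aggregation in which each method ranks the candidate genes for one miRNA and the ensemble orders genes by their mean rank. Ranking by absolute score, the function names, and the toy scores are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, List


def rank_targets(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each gene to its rank (1 = strongest) by absolute score.

    Ranking by absolute value is an assumption; ties are broken by gene name.
    """
    ordered = sorted(scores, key=lambda g: (-abs(scores[g]), g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def borda_ensemble(method_scores: List[Dict[str, float]]) -> List[str]:
    """Order genes by mean rank across methods (lowest mean rank first)."""
    ranks = [rank_targets(s) for s in method_scores]
    # Only genes scored by every method can be aggregated.
    genes = set(ranks[0])
    for r in ranks[1:]:
        genes &= set(r)
    mean_rank = {g: sum(r[g] for r in ranks) / len(ranks) for g in genes}
    return sorted(mean_rank, key=lambda g: (mean_rank[g], g))


# Hypothetical scores for one miRNA from three methods (toy data).
pearson = {"A": -0.9, "B": 0.2, "C": -0.5, "D": 0.1}
ida = {"A": -0.3, "B": 0.05, "C": -0.8, "D": -0.6}
lasso = {"A": -0.4, "B": 0.0, "C": -0.7, "D": -0.1}

ensemble_order = borda_ensemble([pearson, ida, lasso])
print(ensemble_order)
```

In this toy case gene C is ranked first by the ensemble even though Pearson alone ranks A first, which is the kind of disagreement between methods that aggregation is meant to smooth out.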



pone.0131627.g003: Venn diagram of the number of confirmed miRNA-mRNA interactions for the best ensemble method (Pearson+IDA+Lasso) and each individual method (Pearson, IDA and Lasso). For each miRNA, we extract the top 200 target genes ranked by each method for validation.
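The validation step described in the caption (take the top-ranked targets of a miRNA, then count how many are experimentally confirmed) might look like the sketch below. The function names, the tuple-based ground truth, the placeholder identifiers, and ranking by absolute score are assumptions for illustration; the caption's cutoff of 200 is the default, and the demo uses a smaller cutoff only because the toy data has four genes.

```python
from typing import Dict, List, Set, Tuple


def top_k_targets(scores: Dict[str, float], k: int = 200) -> List[str]:
    """Return the k genes with the largest absolute score (assumed ranking rule)."""
    ordered = sorted(scores, key=lambda g: (-abs(scores[g]), g))
    return ordered[:k]


def count_confirmed(
    mirna: str, targets: List[str], ground_truth: Set[Tuple[str, str]]
) -> int:
    """Count predicted targets that appear as (miRNA, gene) pairs in the ground truth."""
    return sum((mirna, gene) in ground_truth for gene in targets)


# Toy data with placeholder identifiers, not real interactions.
scores = {"geneA": -0.9, "geneB": 0.1, "geneC": -0.6, "geneD": 0.3}
ground_truth = {("miR-x", "geneC"), ("miR-x", "geneB")}

top2 = top_k_targets(scores, k=2)
n_confirmed = count_confirmed("miR-x", top2, ground_truth)
print(top2, n_confirmed)
```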

Mentions: We hypothesise that the ensemble methods are better than individual methods because they can produce results that are complementary to the results of the different individual methods. To examine this hypothesis more closely, we extract the results predicted by the best ensemble method (Pearson+IDA+Lasso) and compare them with the predictions of each individual method (Pearson, IDA and Lasso). The complementary characteristic appears for most miRNAs. In particular, there are many cases in which the results of the ensemble method include confirmed interactions that no single individual method discovers in full, and the ensemble method therefore performs better than the individual methods. Such cases occur for 4 miRNAs in EMT (miR-1180, miR-141, miR-18a and miR-96), 7 miRNAs in MCC (miR-197, miR-19a, miR-23a, miR-30a, miR-32, miR-98 and miR-9), and 6 miRNAs in BR51 (miR-125b, miR-196a, miR-21*, miR-27a, miR-30a and miR-342-5p). Fig 3 compares the methods in terms of the number of confirmed miRNA-mRNA interactions in the top 200 targets of the miRNAs in each dataset (Fig 3(a), 3(b) and 3(c)) and the overall number across all three datasets (Fig 3(d)).
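The comparison behind a four-set Venn diagram like Fig 3 can be sketched by assigning every confirmed interaction to the exact combination of methods that recovered it. The helper name `venn_regions` and the toy target labels below are hypothetical; only the method names come from the text.

```python
from collections import Counter
from typing import Dict, Set


def venn_regions(confirmed: Dict[str, Set[str]]) -> Counter:
    """Count items per exclusive Venn region.

    Each region is keyed by the frozenset of methods that found the item.
    """
    all_items = set().union(*confirmed.values())
    regions: Counter = Counter()
    for item in all_items:
        members = frozenset(m for m, found in confirmed.items() if item in found)
        regions[members] += 1
    return regions


# Toy confirmed targets per method (placeholder labels, not real data).
confirmed = {
    "Pearson": {"t1", "t2"},
    "IDA": {"t2", "t3"},
    "Lasso": {"t3"},
    "Pearson+IDA+Lasso": {"t1", "t2", "t3", "t4"},
}

regions = venn_regions(confirmed)
ensemble_only = regions[frozenset({"Pearson+IDA+Lasso"})]
print(ensemble_only, sum(regions.values()))
```

Here `ensemble_only` counts confirmed interactions found by the ensemble but by none of the individual methods, and the ensemble's set is not contained in any single method's set, which corresponds to the complementary behaviour described above.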

