Integrative random forest for gene regulatory network inference.
Bottom Line: Gene regulatory network (GRN) inference based on genomic data is one of the most actively pursued problems in computational biology. Because different types of biological data usually provide complementary information regarding the underlying GRN, a model that integrates big data of diverse types is expected to increase both the power and accuracy of GRN inference. We apply iRafNet to construct a GRN in Saccharomyces cerevisiae and demonstrate that it improves performance in predicting TF-target gene regulations and provides additional functional insights into the predicted gene regulations.
Affiliation: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
Mentions: Following the comparison procedure used by the DREAM5 challenge, for each model, receiver operating characteristic (ROC) and precision-recall curves were computed considering the top 100 000 regulations. Table 2 compares iRafNet, GENIE3 and COMMUNITY in terms of the area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR). COMMUNITY is a more generalized ensemble model, which derives a consensus network by combining the results of all 35 teams participating in the challenge (Marbach et al., 2012). The DREAM5 challenge provides predicted networks for all teams participating in the challenge; based on this information, we compute confidence intervals of the areas under the ROC and precision-recall curves for all models and include the results in Table 2. Although COMMUNITY outperformed every individual team participating in the challenge, iRafNet achieves a better AUPR than both GENIE3 and COMMUNITY. Specifically, the AUPR of iRafNet is ∼9% larger than that of COMMUNITY and ∼21% larger than that of GENIE3 for Network 1. For Network 3, the AUPR of iRafNet is ∼11% larger than that of COMMUNITY and GENIE3. The three methods performed similarly in terms of AUC; however, as shown in Figure 2, iRafNet outperforms the other two methods in the most critical region of the ROC curve, characterized by small false positive rates.
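The evaluation described above, ranking predicted regulations and summarizing the top of the list by AUC and AUPR, can be illustrated with a minimal Python sketch. It is not the official DREAM5 scoring code or the authors' implementation. The function name `roc_pr_areas`, the `(edge, score)` input layout, and the toy edges are hypothetical, and the sketch makes simplifying assumptions: scores have no ties, every ranked edge absent from the gold standard counts as a negative, and AUPR is approximated by average precision normalized by the true edges found inside the truncated list.

```python
def roc_pr_areas(scored_edges, true_edges, top_k=100_000):
    """Return (AUC, AUPR) for the top_k highest-scoring predicted edges.

    scored_edges: iterable of ((regulator, target), score) pairs (assumed layout).
    true_edges:   set of (regulator, target) pairs treated as the gold standard.
    """
    # Keep only the top_k predictions, highest confidence first.
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)[:top_k]
    labels = [1 if edge in true_edges else 0 for edge, _ in ranked]

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one true and one false edge in the ranked list")

    tp = fp = 0
    auc = 0.0
    aupr = 0.0
    for label in labels:
        if label:
            tp += 1
            # Average precision: precision at each recovered true edge.
            aupr += (tp / (tp + fp)) / n_pos
        else:
            fp += 1
            # Each false edge adds a ROC column of width 1/n_neg and height tp/n_pos.
            auc += tp / (n_pos * n_neg)
    return auc, aupr


# Toy usage with made-up edges and scores (illustrative only).
predictions = [
    (("TF1", "g1"), 0.9),
    (("TF1", "g2"), 0.8),
    (("TF2", "g1"), 0.7),
    (("TF2", "g3"), 0.4),
]
gold = {("TF1", "g1"), ("TF2", "g1")}
auc, aupr = roc_pr_areas(predictions, gold)
```

Because true regulations are rare relative to all candidate TF-target pairs, AUPR is far more sensitive than AUC to how many true edges appear near the top of the ranking. This is consistent with the methods having similar AUC while differing in AUPR and in the low false-positive-rate region of the ROC curve.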