Integrative random forest for gene regulatory network inference.

Petralia F, Wang P, Yang J, Tu Z - Bioinformatics (2015)

Bottom Line: Gene regulatory network (GRN) inference based on genomic data is one of the most actively pursued computational biological problems. Because different types of biological data usually provide complementary information regarding the underlying GRN, a model that integrates big data of diverse types is expected to increase both the power and accuracy of GRN inference. We apply iRafNet to construct GRN in Saccharomyces cerevisiae and demonstrate that it improves the performance in predicting TF-target gene regulations and provides additional functional insights to the predicted gene regulations.


Affiliation: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.

Figure 1: iRafNet schematics. For each gene gj, we determine a ranked list of potential regulators via iRafNet. Based on each supporting dataset, we derive weights measuring the prior belief of regulatory relationships. Using expression data, we run random forest to find genes regulating gj. At each node, instead of sampling a random subset of genes from the entire set of genes, we randomly choose an integer and sample genes according to the weights. The final network is derived by ranking potential regulators by their random forest importance score.
© Copyright Policy - creative-commons


In this article, we introduce a weighted sampling scheme under the random forest framework that allows heterogeneous data types to be integrated. As shown in Figure 1, iRafNet first processes supporting data to derive the prior belief of regulatory relationships among genes; it then integrates this prior information with the main dataset via random forest to construct the final GRN. We consider different genomic data, including gene expression data from steady-state, time-series and knockout experiments, and other biological data such as protein–protein interactions. One data source serves as the main input for random forest inference, while the other D datasets (supporting data) are used to derive prior information. iRafNet can be summarized in the following major steps, and detailed information regarding each step is provided in later sections:
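As a rough illustration of the weighted sampling idea in Figure 1 (not the authors' implementation), the Python sketch below draws candidate regulators at a single tree node with probability proportional to their prior weights. The function name `sample_candidate_regulators`, the toy weight vector, the choice to sample without replacement, and the uniform draw of the integer are all assumptions made for this example; the text above does not specify them.

```python
import random


def sample_candidate_regulators(weights, n_candidates, rng):
    """Draw up to n_candidates distinct gene indices, favouring high prior weights.

    Hypothetical helper: sampling without replacement is an assumption here.
    """
    if any(w < 0 for w in weights):
        raise ValueError("prior weights must be non-negative")
    # Only genes with a positive prior weight can be drawn.
    remaining = {gene: w for gene, w in enumerate(weights) if w > 0}
    chosen = []
    while remaining and len(chosen) < n_candidates:
        genes = list(remaining)
        # Draw one gene with probability proportional to its remaining weight.
        pick = rng.choices(genes, weights=[remaining[g] for g in genes], k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # a gene is never drawn twice at the same node
    return chosen


rng = random.Random(0)
prior = [0.0, 1.0, 3.0, 0.0, 2.0]  # toy prior weights for five candidate genes
# Assumption: the integer is drawn uniformly between 1 and the number of genes.
n = rng.randint(1, len(prior))
print(sample_candidate_regulators(prior, n, rng))
```

In a standard random forest the candidate set at each node is a uniform random subset of predictors; here genes with zero prior weight are never proposed, and genes with larger weights are proposed more often, which is how prior information from supporting data would steer the split search in this sketch.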

