Gene network inference by fusing data from diverse distributions.

Žitnik M, Zupan B - Bioinformatics (2015)

Bottom Line: In a simulation study, we demonstrate good predictive performance of FuseNet in comparison to several popular graphical models. Fusion of datasets offers substantial gains relative to inference of separate networks for each dataset. Our results demonstrate that network inference methods for non-Gaussian data can help in accurate modeling of the data generated by emergent high-throughput technologies.


Affiliation: Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.


The strength of association between gene sets from the Gene Ontology (GO) and networks inferred with FuseNet. Inferred networks were overlaid with GO terms and subnetworks induced by each GO term were assessed for how well they corresponded to network communities. Four different scoring functions were used to quantify the presence of different structural notions of communities (Supplementary Section S4) that can appear in biological networks: flake-over-median-degree (flake-ODF), cut ratio, triangle participation ratio (TPR) and conductance. Considering breast cancer RNA-sequencing (RNA-seq) and somatic mutation data (Mut), these boxplots show the gains that fusion of data from different distributions (Mut & RNA-seq) can offer over network inference from any dataset alone, either RNA-seq or Mut. Poisson FuseNet was used with RNA-sequencing data, multinomial FuseNet with somatic mutation data and fully-specified FuseNet for joint consideration of RNA-sequencing and mutation data
© Copyright Policy - creative-commons

Mentions: To characterize how functionally informative the inferred networks are, we employ four structural definitions of network communities (Fig. 4 and Supplementary Figs S6 and S7). These represent four possible notions of association between a given GO term and the inferred network (Yang and Leskovec, 2012). The triangle participation ratio quantifies how well genes that are members of a given GO term are linked to each other in the inferred network. The cut ratio captures the abundance of external connectivity, i.e. edges between genes of a GO term and the rest of the network, whereas conductance and flake-ODF consider both internal and external network connectivity. Through these four measures we are able to estimate the overall concordance of inferred gene networks and known functional annotation of genes. For these reasons, networks that score higher on many measures should be considered more informative across a wider spectrum of cellular functions.
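
The four measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the formulas follow the commonly cited definitions from Yang and Leskovec (2012) as an assumption, the function name `community_scores` and the toy gene labels are hypothetical, and the network is taken to be undirected, unweighted and free of self-loops.

```python
from itertools import combinations


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def community_scores(adj, members):
    """Score how community-like the gene set `members` is within a network.

    adj     -- dict mapping each gene to the set of its neighbours
               (undirected, no self-loops; an assumed input format)
    members -- genes annotated with one GO term (hypothetical input)
    """
    S = set(members)
    n, n_s = len(adj), len(S)
    internal_ends = 0   # internal edge endpoints; each internal edge counted twice
    boundary = 0        # edges leaving the gene set
    in_triangle = 0     # members that close at least one triangle inside S
    mostly_outside = 0  # members with fewer internal than external edges

    for u in S:
        inside = adj[u] & S
        internal_ends += len(inside)
        boundary += len(adj[u]) - len(inside)
        # u is in a triangle within S if two of its internal neighbours are linked
        if any(w in adj[v] for v, w in combinations(inside, 2)):
            in_triangle += 1
        if len(inside) < len(adj[u]) / 2:
            mostly_outside += 1

    m_s = internal_ends // 2  # number of edges inside the gene set
    return {
        "tpr": _ratio(in_triangle, n_s),
        "cut_ratio": _ratio(boundary, n_s * (n - n_s)),
        "conductance": _ratio(boundary, 2 * m_s + boundary),
        "flake_odf": _ratio(mostly_outside, n_s),
    }


# Toy network: a triangle g1-g2-g3 with a tail g3-g4-g5 (made-up genes).
toy = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
}
scores = community_scores(toy, {"g1", "g2", "g3"})
```

In this toy case every member of the gene set sits in a triangle and only one edge leaves the set, so the triangle participation ratio is 1 while the cut ratio and conductance are small. Applying the same function to each GO term's gene set in an inferred network would give per-term scores of the kind summarized in the boxplots.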

