Reconstruction of large-scale regulatory networks based on perturbation graphs and transitive reduction: improved methods and their evaluation.

Pinna A, Heise S, Flassig RJ, de la Fuente A, Klamt S - BMC Syst Biol (2013)

Bottom Line: In this work we introduce novel variants for PG generation and TR, leading to significantly improved performances. The benchmarks clearly demonstrate the superior reconstruction performance of the novel PG and TR variants compared to existing approaches. Moreover, the benchmark enabled us to draw some general conclusions.

Affiliation: Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany. klamt@mpi-magdeburg.mpg.de.

ABSTRACT

Background: The data-driven inference of intracellular networks is one of the key challenges of computational and systems biology. As suggested by recent works, a simple yet effective approach for reconstructing regulatory networks comprises the following two steps. First, the observed effects induced by directed perturbations are collected in a signed and directed perturbation graph (PG). In a second step, Transitive Reduction (TR) is used to identify and eliminate those edges in the PG that can be explained by paths and are therefore likely to reflect indirect effects.

Results: In this work we introduce novel variants for PG generation and TR, leading to significantly improved performances. The key modifications concern: (i) use of novel statistical criteria for deriving a high-quality PG from experimental data; (ii) the application of local TR which allows only short paths to explain (and remove) a given edge; and (iii) a novel strategy to rank the edges with respect to their confidence. To compare the new methods with existing ones we not only apply them to a recent DREAM network inference challenge but also to a novel and unprecedented synthetic compendium consisting of 30 networks of 5000 genes each, simulated with varying biological and measurement error variances, resulting in a total of 270 datasets. The benchmarks clearly demonstrate the superior reconstruction performance of the novel PG and TR variants compared to existing approaches. Moreover, the benchmark enabled us to draw some general conclusions. For example, it turns out that local TR restricted to paths with a length of only two is often sufficient or even favorable. We also demonstrate that considering edge weights is highly beneficial for TR whereas consideration of edge signs is of minor importance. We explain these observations from a graph-theoretical perspective and discuss the consequences with respect to a greatly reduced computational demand to conduct TR. Finally, as a realistic application scenario, we use our framework for inferring gene interactions in yeast based on a library of gene expression data measured in mutants with single knockouts of transcription factors. The reconstructed network shows a significant enrichment of known interactions, especially within the 100 most confident (and for experimental validation most relevant) edges.

Conclusions: This paper presents several major achievements. The novel methods introduced herein can be seen as state of the art for inference techniques relying on perturbation graphs and transitive reduction. Another key result of the study is the generation of a new and unprecedented large-scale in silico benchmark dataset accounting for different noise levels and providing a solid basis for unbiased testing of network inference methodologies. Finally, applying our approach to Saccharomyces cerevisiae suggested several new gene interactions with high confidence awaiting experimental validation.
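The two-step idea from the Background (collect perturbation effects in a signed, directed PG, then remove edges that a path can explain) can be illustrated with a minimal Python sketch of local TR restricted to paths of length two. The removal rule used here is an assumption made only for illustration: it is not rule (2) or rule (3) of the paper, which are not given in this excerpt. The `strength` values are hypothetical confidence scores where larger means a stronger observed effect, and the function and variable names are invented for this example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]
# Each edge maps to (sign, strength): sign is +1 or -1, strength is a
# hypothetical confidence score where larger means a stronger effect.
Graph = Dict[Edge, Tuple[int, float]]


def local_transitive_reduction(pg: Graph) -> Graph:
    """Drop edges explainable by a sign-consistent path of length two.

    All checks are made against the original graph `pg`, so the result
    does not depend on the order in which edges are visited.
    """
    # Successor lists for quick lookup of two-edge paths src -> mid -> dst.
    successors: Dict[str, List[str]] = {}
    for src, dst in pg:
        successors.setdefault(src, []).append(dst)

    reduced: Graph = {}
    for (src, dst), (sign, strength) in pg.items():
        explained = False
        for mid in successors.get(src, []):
            if mid in (src, dst) or (mid, dst) not in pg:
                continue
            sign1, w1 = pg[(src, mid)]
            sign2, w2 = pg[(mid, dst)]
            # Assumed rule: the path's overall sign must match the edge,
            # and every path edge must be stronger than the direct edge.
            if sign1 * sign2 == sign and min(w1, w2) > strength:
                explained = True
                break
        if not explained:
            reduced[(src, dst)] = (sign, strength)
    return reduced


pg: Graph = {
    ("A", "B"): (+1, 0.9),
    ("B", "C"): (-1, 0.8),
    ("A", "C"): (-1, 0.3),  # weak and sign-consistent with A -> B -> C
    ("A", "D"): (+1, 0.7),
    ("D", "C"): (+1, 0.6),  # A -> D -> C is positive, so it cannot explain A -| C
}
result = local_transitive_reduction(pg)
```

In this toy graph only the edge from A to C is removed, because the path through B reproduces its negative sign with stronger edges. Restricting the search to one intermediate node keeps the cost per edge proportional to the out-degree of its source, which is in line with the reduced computational demand mentioned in the Results.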

Figure 5: Performance of the new TRANSWESD and LTR variants on the SysGenSIM dataset. Parameters used to obtain the perturbation graph were β = 2.0 and γ = 0.05, while α = 0.95 and α = 0.15 were selected for the TRANSWESD and LTR variants, respectively. AUPR scores are averaged across the 30 networks (10 networks for each of the three average node degrees considered) simulated with the same noise configuration.

Mentions: The effect of the TR algorithms applied to PGnew (Tables 3 and 4, Figure 5) is more heterogeneous and differentiated than in the DREAM4 networks. First, the unweighted versions of LTR decrease the quality of the perturbation graph PGnew in all cases, whereas weighted LTR and the non-local versions of TRANSWESD improve it, in part significantly, in all scenarios (with one minor exception). This demonstrates that weighted TR can be highly beneficial. However, local TRANSWESD_{s,w,2}, which was comparable with LTR in the DREAM4 networks, performs as unfavorably on these large and noisy networks as unweighted LTR. This again indicates that rule (3) seems to be better suited for local TR than rule (2). Furthermore, the quality of the PG as well as the relative improvement achieved by the (weighted) TR techniques depends substantially on the noise level, both with respect to AUPR and with respect to the number of TPs and FPs. An interesting observation concerns the effect of biological variance on the reconstruction quality: moderately increased (medium) biological noise appears to be advantageous when measurement noise is high, for all values of K (i.e., networks with noise configuration MH perform better than those with LH; see Figure 5 and Table 3 as well as Tables T1 and T2 and Figure F3 in Additional file 1). Thus, higher biological noise may help to uncover true perturbation effects under high measurement uncertainty.
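The AUPR scores discussed above are computed from a confidence-ranked edge list and then averaged over networks that share a noise configuration. A minimal sketch of that evaluation follows. It uses the step-wise (average precision) approximation of the area under the precision-recall curve, which is an assumption: the exact interpolation used in the paper is not stated in this excerpt. The function names and toy data are invented for illustration.

```python
from typing import List, Sequence, Set, Tuple

Edge = Tuple[str, str]


def aupr(ranked_edges: Sequence[Edge], true_edges: Set[Edge]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    if not true_edges:
        raise ValueError("gold standard must contain at least one edge")
    hits = 0
    area = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            # Recall rises by 1/|true_edges| at this rank; the added area
            # is that step multiplied by the precision reached here.
            area += (hits / rank) / len(true_edges)
    return area


def mean_aupr(runs: List[Tuple[Sequence[Edge], Set[Edge]]]) -> float:
    """Average AUPR over several (ranking, gold standard) pairs."""
    scores = [aupr(ranking, truth) for ranking, truth in runs]
    return sum(scores) / len(scores)


truth = {("A", "B"), ("B", "C")}
noisy_ranking = [("A", "B"), ("A", "C"), ("B", "C")]  # one false positive at rank 2
clean_ranking = [("A", "B"), ("B", "C")]              # all true edges ranked first

score_noisy = aupr(noisy_ranking, truth)
score_clean = aupr(clean_ranking, truth)
average = mean_aupr([(noisy_ranking, truth), (clean_ranking, truth)])
```

Because the score rewards true edges that appear early in the ranking, it is sensitive to the edge-ranking strategy as well as to which edges TR removes.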
