Modelling nonstationary gene regulatory processes.

Grzegorczyk M, Husmeier D, Rahnenführer J - Adv Bioinformatics (2010)

Bottom Line: Generalisations based on changepoint processes aim to relax the homogeneity assumption, whereas free allocation mixture models are more flexible and, in principle, more adequate for modelling nonlinear processes. In our paper, we compare both paradigms and discuss theoretical shortcomings of the latter approach. We show that a model based on the changepoint process yields systematically better results than the free allocation model when inferring nonstationary gene regulatory processes from simulated gene expression time series.


Affiliation: Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.

ABSTRACT
An important objective in systems biology is to infer gene regulatory networks from postgenomic data, and dynamic Bayesian networks have been widely applied as a popular tool to this end. The standard approach for nondiscretised data is restricted to a linear model and a homogeneous Markov chain. Recently, various generalisations based on changepoint processes and free allocation mixture models have been proposed. The former aim to relax the homogeneity assumption, whereas the latter are more flexible and, in principle, more adequate for modelling nonlinear processes. In our paper, we compare both paradigms and discuss theoretical shortcomings of the latter approach. We show that a model based on the changepoint process yields systematically better results than the free allocation model when inferring nonstationary gene regulatory processes from simulated gene expression time series. We further cross-compare the performance of both models on three biological systems: macrophages challenged with viral infection, circadian regulation in Arabidopsis thaliana, and morphogenesis in Drosophila melanogaster.


Figure 2. Edge posterior probabilities: cross-method comparison on synthetic sine data. The figure shows three histograms of the inferred marginal edge posterior probabilities in the sinusoid network with N = 2 nodes, cX = 0.5, and cY = 0.5, as obtained with BGe (a), BGM (b), and BGMD (c). In each histogram, the four bars represent the four possible edges. Left: self-loop X → X (true); centre left: X → Y (true); centre right: self-loop Y → Y (false); right: Y → X (false). Each bar shows the marginal posterior probability averaged over 50 independent data instantiations. BGe and BGM have a high propensity for learning the spurious feedback loop Y → Y (centre right white bars). BGMD (right histogram) assigns a higher posterior probability to the correct edge X → Y (centre left black bar) and suppresses the spurious feedback loop Y → Y (centre right white bar).
© Copyright Policy - open-access

Mentions: This trend can be visualised by histograms of the average edge posterior probabilities. As an example, Figure 2 shows the average marginal edge posterior probabilities of the four possible edges for the two-node (N = 2) sinusoid network with cX = 0.5 and cY = 0.5. All three models under comparison consistently assign the highest posterior probability to the true self-loop X → X and the lowest posterior probability to the false edge Y → X. However, BGe and BGM favour the spurious feedback loop Y → Y over the true edge X → Y, while the proposed BGMD suppresses the false self-feedback loop and assigns a higher edge posterior probability to the true edge X → Y. This explains why BGMD yields a higher network reconstruction accuracy (see Figure 1): it is less susceptible to inferring spurious self-feedback loops (see Figure 2).
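The averaging behind such a histogram can be illustrated with a minimal Python sketch. It assumes that each data instantiation yields a set of sampled network structures stored as binary adjacency matrices, which is a common way to estimate marginal edge posteriors but is not necessarily the authors' implementation. The function names, the node ordering (X, Y), and the toy matrices are hypothetical and are not data from the paper.

```python
import numpy as np

def marginal_edge_posteriors(sampled_graphs):
    """Estimate marginal edge posteriors as the fraction of sampled graphs
    that contain each edge.

    sampled_graphs has shape (n_samples, n_nodes, n_nodes); entry [s, i, j]
    is 1 if sample s contains the edge i -> j and 0 otherwise.
    """
    graphs = np.asarray(sampled_graphs, dtype=float)
    return graphs.mean(axis=0)

def average_over_datasets(per_dataset_samples):
    """Average the marginal edge posteriors over independent data sets."""
    per_dataset = [marginal_edge_posteriors(g) for g in per_dataset_samples]
    return np.mean(per_dataset, axis=0)

# Hypothetical toy input: two data sets, two sampled graphs each.
X, Y = 0, 1  # assumed node ordering
dataset_a = np.array([[[1, 1], [0, 0]],   # X -> X and X -> Y
                      [[1, 0], [0, 1]]])  # X -> X and spurious Y -> Y
dataset_b = np.array([[[1, 1], [0, 0]],
                      [[1, 1], [0, 0]]])

avg = average_over_datasets([dataset_a, dataset_b])
for name, (i, j) in {"X -> X": (X, X), "X -> Y": (X, Y),
                     "Y -> Y": (Y, Y), "Y -> X": (Y, X)}.items():
    print(f"{name}: {avg[i, j]:.2f}")
```

In this toy input the spurious self-loop Y → Y appears in only one of the four sampled graphs, so its averaged posterior stays below that of the true edge X → Y, which is the qualitative pattern the text reports for BGMD.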

