Modelling nonstationary gene regulatory processes.
Affiliation: Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.
ABSTRACT
An important objective in systems biology is to infer gene regulatory networks from postgenomic data, and dynamic Bayesian networks have been widely applied as a popular tool to this end. The standard approach for nondiscretised data is restricted to a linear model and a homogeneous Markov chain. Recently, various generalisations based on changepoint processes and free allocation mixture models have been proposed. The former aim to relax the homogeneity assumption, whereas the latter are more flexible and, in principle, more adequate for modelling nonlinear processes. In our paper, we compare both paradigms and discuss theoretical shortcomings of the latter approach. We show that a model based on the changepoint process yields systematically better results than the free allocation model when inferring nonstationary gene regulatory processes from simulated gene expression time series. We further cross-compare the performance of both models on three biological systems: macrophages challenged with viral infection, circadian regulation in Arabidopsis thaliana, and morphogenesis in Drosophila melanogaster.
$$R_{\mathrm{BGMD}} = \frac{P(K=2)}{P(K=1)} \cdot \int_{j}^{j+1} \frac{6\,(m-b_1)(b_1-2)}{(m-2)^3}\, db_1. \tag{21}$$

In the second theoretical study we vary the length of the time series, $m = 3, 5, 7, \ldots, 25$, and consider a heterogeneous time series consisting of two segments of equal length, $t_2, \ldots, t_{\lfloor m/2 \rfloor + 1}$ and $t_{\lfloor m/2 \rfloor + 2}, \ldots, t_m$. This corresponds to $m_1 = m_2 = 0.5 \cdot (m - 1)$ in the BGM model. For the BGMD model, setting $j = \lfloor m/2 \rfloor + 1$, the changepoint has to be located in the interval $b_1 \in [t_{\lfloor m/2 \rfloor + 1}, t_{\lfloor m/2 \rfloor + 2}]$. Figures 10(a) and 10(b) show the resulting prior probability ratios, on the original and the logarithmic scale, as a function of $m$. The prior ratio $R$ is considerably lower for the BGM model than for the BGMD model. Moreover, the logarithmic plot in Figure 10(b) shows that the prior ratio of the BGM model decreases much more strongly with the sample size $m$ than that of the BGMD model. This suggests that the BGM model imposes a more severe penalty for complexity (non-stationarity), and that this penalty grows with the sample size $m$.

This tendency may explain the finding in [4] for the macrophage gene expression time series, which we have reproduced in the present study (Figure 3(c)): the BGM model does not infer a clear two-phase nature of the time series under simultaneous immune activation (with IFNγ) and viral infection (with CMV). A possible biological explanation was offered in [4]. However, the novel BGMD model does not support the hypothesis of a decreased probability for the two-phase nature (Figure 3(f)). Moreover, the analysis above has revealed that a strong penalty against the two-phase process is inherent in the BGM model. This suggests that the results reported in [4], which we have reproduced in our study, might be an artefact of the BGM model rather than a genuine biological effect.
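The integral factor in equation (21) can be evaluated in closed form, which makes the dependence on m easy to check. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and because the prior on the number of segments K is not given in this excerpt, the ratio P(K=2)/P(K=1) is passed in as an assumed parameter `k_prior_ratio`. Substituting x = (b1 − 2)/(m − 2) turns the integrand into 6x(1 − x), whose antiderivative is 3x² − 2x³.

```python
from fractions import Fraction


def changepoint_interval_mass(m: int) -> Fraction:
    """Integral of 6(m - b1)(b1 - 2)/(m - 2)^3 over [j, j + 1], j = floor(m/2) + 1.

    Uses the antiderivative 3x^2 - 2x^3 with x = (b1 - 2)/(m - 2).
    Exact rational arithmetic avoids rounding issues for small m.
    """
    if m < 3:
        raise ValueError("m must be at least 3")
    j = m // 2 + 1

    def antiderivative(b: int) -> Fraction:
        x = Fraction(b - 2, m - 2)
        return 3 * x**2 - 2 * x**3

    return antiderivative(j + 1) - antiderivative(j)


def prior_ratio_bgmd(m: int, k_prior_ratio: float = 1.0) -> float:
    """R_BGMD from equation (21).

    k_prior_ratio stands in for P(K=2)/P(K=1); its value is an assumption,
    since the prior on K is not specified in this excerpt.
    """
    return k_prior_ratio * float(changepoint_interval_mass(m))


if __name__ == "__main__":
    # Series lengths considered in the second theoretical study.
    for m in range(3, 26, 2):
        print(m, prior_ratio_bgmd(m))
```

For m = 3 the interval [j, j + 1] covers the whole admissible range [2, m], so the integral equals 1; for m = 5 it equals 13/27. The factor shrinks as m grows because the unit-length interval covers an ever smaller share of [2, m].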