Modelling nonstationary gene regulatory processes.

Grzegorczyk M, Husmeier D, Rahnenführer J - Adv Bioinformatics (2010)

Affiliation: Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.

ABSTRACT
An important objective in systems biology is to infer gene regulatory networks from postgenomic data, and dynamic Bayesian networks have been widely applied as a popular tool to this end. The standard approach for nondiscretised data is restricted to a linear model and a homogeneous Markov chain. Recently, various generalisations based on changepoint processes and free allocation mixture models have been proposed. The former aim to relax the homogeneity assumption, whereas the latter are more flexible and, in principle, more adequate for modelling nonlinear processes. In our paper, we compare both paradigms and discuss theoretical shortcomings of the latter approach. We show that a model based on the changepoint process yields systematically better results than the free allocation model when inferring nonstationary gene regulatory processes from simulated gene expression time series. We further cross-compare the performance of both models on three biological systems: macrophages challenged with viral infection, circadian regulation in Arabidopsis thaliana, and morphogenesis in Drosophila melanogaster.


Figure 4: Heat maps. Arabidopsis data. Graphical heat map representations of the temporal connectivity structures for the Arabidopsis thaliana data. (a) and (b): heat matrices for experiments T20 (a) and T28 (b) inferred with the BGM model. (c) and (d): heat matrices for experiments T20 (c) and T28 (d) inferred with the novel BGMD model. Each heat map indicates the posterior probability of two time points being assigned to the same compartment (mixture component). The probabilities are represented by a grey shading, where white corresponds to a probability of 1, and black corresponds to a probability of 0. The numbers on the axes represent the time points of the time course experiment.
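
As an illustration of how a heat matrix of this kind can be computed, the following minimal sketch estimates the co-allocation probability of each pair of time points from a set of sampled allocation vectors. It assumes that posterior samples are available as lists of integer component labels, one label per time point. The function name `coallocation_matrix` and the toy samples are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def coallocation_matrix(samples: Sequence[Sequence[int]]) -> List[List[float]]:
    """Estimate P(time points i and j share a component) from sampled allocations."""
    if not samples:
        raise ValueError("need at least one allocation sample")
    n = len(samples[0])
    counts = [[0] * n for _ in range(n)]
    for alloc in samples:
        if len(alloc) != n:
            raise ValueError("all allocation vectors must have the same length")
        for i in range(n):
            for j in range(n):
                # Count the samples in which i and j carry the same label.
                if alloc[i] == alloc[j]:
                    counts[i][j] += 1
    # Relative frequencies approximate the posterior co-allocation probabilities.
    return [[c / len(samples) for c in row] for row in counts]


# Made-up example: 4 sampled allocations of 5 time points (not real data).
toy_samples = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
heat = coallocation_matrix(toy_samples)
```

In a grey-scale rendering of `heat`, entries near 1 would appear white and entries near 0 black, matching the shading convention described in the caption.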

For the Arabidopsis thaliana data, the BGM model also inferred a biologically plausible two-stage process [4]. In this application, the two stages are likely to be related to the diurnal nature of the dark-light cycle influencing the circadian genes. The plants were subjected to different prehistories, that is, to different lengths of the artificial, experimentally controlled light-dark cycle. In experiment T28 the plants were entrained in an increased day length of 14 hours light followed by 14 hours darkness, and in experiment T20 they were entrained in a decreased day length of 10 hours light followed by 10 hours darkness. As an effect of these two entrainments, a phase shift in the gene-regulatory processes between the two experiments was expected [4]. The BGM model inferred a trend towards a phase shift of the changepoint (subjective day to subjective night) of about 4–6 hours as a consequence of the increased day length. The heat maps in Figures 4(a) and 4(b) show that the connected blocks (compartments) of the time series are shifted along the diagonal by 2–3 time points (4–6 hours). The BGMD model infers the same trend but with a stronger separation score between these compartments (see Figures 4(c) and 4(d)). We note that the BGMD model is based on changepoints, so that compartments once left cannot be revisited. That is, while the BGM model tends to allocate the first time points (t2, t3) and the last time points (t9,…, t13) in experiment T28 to one single component (light grey shading in the top right and bottom left area of the heat map in the top centre panel of Figure 4), the BGMD model has to allocate the last time points (t9,…, t13) to an additional third component, as the first compartment (t2, t3) cannot be reused after the transition to the second compartment (t4,…, t8).
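
The difference between the two allocation schemes can be made concrete with a small sketch. It assumes allocations are written as integer label vectors over the time points t2,…, t13 of experiment T28, with labels chosen to mirror the description above. The helper names `revisits_compartment` and `relabel_as_segments` are hypothetical, and the code only illustrates the constraint; it is not the inference procedure of either model.

```python
from typing import List, Sequence


def revisits_compartment(alloc: Sequence[int]) -> bool:
    """True if a label reappears after the series has moved on to another label."""
    seen = set()
    previous = None
    for label in alloc:
        if label != previous:
            if label in seen:
                return True
            seen.add(label)
            previous = label
    return False


def relabel_as_segments(alloc: Sequence[int]) -> List[int]:
    """Give every contiguous run its own label, as a changepoint scheme requires."""
    segments = []
    current = -1
    previous = object()  # sentinel that compares unequal to every label
    for label in alloc:
        if label != previous:
            current += 1
            previous = label
        segments.append(current)
    return segments


# Free-allocation pattern described in the text for T28 (labels are illustrative):
# t2, t3 -> component 0; t4..t8 -> component 1; t9..t13 -> component 0 again.
free_alloc = [0, 0] + [1] * 5 + [0] * 5
changepoint_alloc = relabel_as_segments(free_alloc)
n_components = max(changepoint_alloc) + 1
```

Here `revisits_compartment(free_alloc)` is true, so this pattern is not admissible under a changepoint scheme. After relabelling, the last five time points form a separate third segment, which corresponds to the additional third component described for the BGMD model.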

