Bayesian model comparison and parameter inference in systems biology using nested sampling.
Bottom Line:
We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
View Article: PubMed Central - PubMed
Affiliation: Computational and Systems Biology, John Innes Centre, Norwich, United Kingdom.
ABSTRACT
Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
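The core of the method described above can be illustrated with a toy implementation. The following is a minimal sketch, not the authors' code: nested sampling with naive rejection sampling from the prior (viable only for toy problems), applied to a one-dimensional Gaussian likelihood with a uniform prior, for which the evidence is known analytically. All names (`nested_sampling`, `log_like`, `SIGMA`) are illustrative.

```python
import math
import random

def log_add(a, b):
    """log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def nested_sampling(log_likelihood, prior_sample, n_live=50, n_iter=400, seed=1):
    """Minimal nested sampling (Skilling): returns an estimate of log-evidence.

    At each step the worst live point is discarded, its weight (prior shell
    width times likelihood) is accumulated into the evidence, and it is
    replaced by a fresh prior draw constrained to higher likelihood.
    """
    rng = random.Random(seed)
    live = [prior_sample(rng) for _ in range(n_live)]
    log_l = [log_likelihood(x) for x in live]
    log_z = -math.inf
    log_shell = math.log(1.0 - math.exp(-1.0 / n_live))  # width of one prior shell
    for i in range(n_iter):
        worst = min(range(n_live), key=lambda j: log_l[j])
        log_l_star = log_l[worst]
        # contribution of the discarded point: shell width times its likelihood
        log_z = log_add(log_z, log_shell - i / n_live + log_l_star)
        while True:  # naive rejection sampling under the constraint L > L*
            x = prior_sample(rng)
            if log_likelihood(x) > log_l_star:
                break
        live[worst], log_l[worst] = x, log_likelihood(x)
    # remaining prior mass, spread over the surviving live points
    log_x_final = -n_iter / n_live
    for ll in log_l:
        log_z = log_add(log_z, log_x_final - math.log(n_live) + ll)
    return log_z

# Toy problem: normalised Gaussian likelihood (sd = 0.1), uniform prior on [-1, 1].
# The analytic evidence is ~0.5 (prior density 1/2 times near-unit Gaussian mass),
# so the estimate should land near log(0.5).
SIGMA = 0.1

def log_like(x):
    return -0.5 * (x / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2 * math.pi))

log_z = nested_sampling(log_like, lambda rng: rng.uniform(-1.0, 1.0))
```

Running this for two competing models and subtracting the returned log-evidences gives the log Bayes factor used for model ranking.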
Mentions: Given the nature of the sparse and noisy data, it is not too surprising that a simpler model with two variables and six parameters is given preference over the model with six variables and eight parameters from which the data were actually generated. If the data are of better quality, i.e. noise-free and denser, we can see the repressilator model gaining more support (Figure 5) relative to the Lotka-Volterra system; but until an unreasonable amount of data is available (500 data points), the Lotka-Volterra model is preferred, as it is the more parsimonious explanation of the data — visually, both systems can fit the given data very well. Perhaps counter-intuitively, the evidence decreases with the increasing quantity of data. This is a consequence of the form of the log-likelihood function: as there are now more data points, unless the fit is exceptionally good, the least-squares residual increases from summing up more errors. The evidence comprises both the Occam factor and the best-fit likelihood (at least assuming the posterior is approximately Gaussian) [28]. Hence a worse likelihood score will similarly depress the evidence.
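The point about the log-likelihood can be made concrete with a small sketch (hypothetical, not from the paper). Even when residuals are scored against the true curve — the best fit any model could hope for — each extra noisy data point contributes another squared residual, so the attainable Gaussian log-likelihood falls (on average by half a unit per point), dragging the evidence down with it.

```python
import math
import random

def best_fit_log_likelihood(n_points, noise_sd=0.1, seed=0):
    """Best-case Gaussian log-likelihood (up to an additive constant).

    Residuals here are pure observation noise, i.e. the model matches
    the true trajectory exactly.  The summed -0.5 * (r / sd)**2 term
    still keeps falling as the dataset grows.
    """
    rng = random.Random(seed)
    log_l = 0.0
    for _ in range(n_points):
        r = rng.gauss(0.0, noise_sd)  # residual = observation noise only
        log_l += -0.5 * (r / noise_sd) ** 2
    return log_l

# A tenfold increase in data points lowers even the best attainable
# log-likelihood, mirroring the drop in evidence discussed in the text.
sparse = best_fit_log_likelihood(50)
dense = best_fit_log_likelihood(500)
```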