Bayesian history matching of complex infectious disease models using emulation: a tutorial and a case study on HIV in Uganda.

Andrianakis I, Vernon IR, McCreesh N, McKinley TJ, Oakley JE, Nsubuga RN, Goldstein M, White RG - PLoS Comput. Biol. (2015)

Bottom Line: History matching is an iterative procedure that reduces the simulator's input space by identifying and discarding areas that are unlikely to provide a good match to the empirical data. Simulator evaluations made within this region were found to have a 65% probability of fitting all 18 outputs. Further research is required to explicitly address the stochastic nature of the simulator as well as to account for correlations between outputs.

Affiliation: Dept. of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom.

ABSTRACT
Advances in scientific computing have allowed the development of complex models that are being routinely applied to problems in disease epidemiology, public health and decision making. The utility of these models depends in part on how well they can reproduce empirical data. However, fitting such models to real-world data is greatly hindered both by large numbers of input and output parameters, and by long run times, such that many modelling studies lack a formal calibration methodology. We present a novel method that has the potential to improve the calibration of complex infectious disease models (hereafter called simulators). We present this in the form of a tutorial and a case study where we history match a dynamic, event-driven, individual-based stochastic HIV simulator, using extensive demographic, behavioural and epidemiological data available from Uganda. The tutorial describes history matching and emulation. History matching is an iterative procedure that reduces the simulator's input space by identifying and discarding areas that are unlikely to provide a good match to the empirical data. History matching relies on the computational efficiency of a Bayesian representation of the simulator, known as an emulator. Emulators mimic the simulator's behaviour, but are often several orders of magnitude faster to evaluate. In the case study, we use a 22-input simulator, fitting its 18 outputs simultaneously. After 9 iterations of history matching, a non-implausible region of the simulator input space was identified that was 10^11 times smaller than the original input space. Simulator evaluations made within this region were found to have a 65% probability of fitting all 18 outputs. History matching and emulation are useful additions to the toolbox of infectious disease modellers. Further research is required to explicitly address the stochastic nature of the simulator as well as to account for correlations between outputs.
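
To make the iterative idea in the abstract concrete, the following is a minimal sketch of wave-by-wave input space reduction on a toy problem. Everything in it is an assumption for illustration: the one-input `simulator`, the quadratic least-squares fit standing in for the Bayesian emulator, the use of residual variance as a crude emulator uncertainty, and the cutoff of 3. It is not the emulator or the simulator used in the case study.

```python
import numpy as np

def simulator(x, rng):
    """Hypothetical stochastic toy simulator: one input, one noisy output."""
    x = np.asarray(x, dtype=float)
    return x**2 + rng.normal(0.0, 0.05, size=x.shape)

def history_match(z, var_obs, lo, hi, n_waves=3, n_runs=40, cutoff=3.0, seed=0):
    """Shrink [lo, hi] over several waves; return kept inputs and kept fractions."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(lo, hi, 2001)       # dense grid over the input space
    keep = np.ones(candidates.size, dtype=bool)  # current non-implausible set
    fractions = []
    for _ in range(n_waves):
        if not keep.any():                       # nothing left to sample from
            break
        # Run the (notionally slow) simulator at a few non-implausible inputs.
        x_train = rng.choice(candidates[keep], size=n_runs)
        y_train = simulator(x_train, rng)
        # Stand-in emulator (assumption): quadratic least-squares fit.
        coef = np.polyfit(x_train, y_train, deg=2)
        resid = y_train - np.polyval(coef, x_train)
        var_em = resid.var(ddof=3)               # crude emulator uncertainty
        mean_em = np.polyval(coef, candidates)   # cheap prediction everywhere
        # Implausibility: distance to the data in units of total standard deviation.
        impl = np.abs(z - mean_em) / np.sqrt(var_obs + var_em)
        keep &= impl < cutoff                    # discard implausible inputs
        fractions.append(float(keep.mean()))
    return candidates[keep], fractions

kept, fractions = history_match(z=1.0, var_obs=0.05**2, lo=0.0, hi=2.0)
print("fraction of input space kept after each wave:", fractions)
print("non-implausible interval (approx.):", kept.min(), kept.max())
```

Because each wave only ever removes inputs, the kept fraction cannot grow from one wave to the next. The emulator is refitted in every wave using runs from the current non-implausible set only, which is what lets it become more accurate where it matters.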


pcbi-1003968-g007: Cumulative distribution function of simulator run implausibility by waves. Each line represents the percentage of each wave's simulator runs with an implausibility less than the value indicated by the x-axis.

Mentions: In this section we examine the fit of the simulator output to the empirical data in successive waves. We first define, in equation 13, the implausibility for one output of the actual simulator runs, with the run sample mean and variance as defined in equations 3 and 4. We also define the maximum implausibility of a run at a given input as the largest of these implausibilities across outputs. Note that this version of the implausibility does not include code uncertainty, as the simulator has been evaluated at that input, and that the ensemble variability is estimated directly from the simulator runs (and so we may now describe runs as ‘acceptable’ if their implausibility is low). Fig. 7 shows the implausibility of the simulator runs in successive waves. In wave 9, 50% of the runs were non-implausible, while in wave 10 the non-implausible (or acceptable) runs were 65% of the total number of runs, all coming from a region that is a tiny fraction (about 10^-11) of the original input space. As we were then in a position to generate large numbers of acceptable runs from the non-implausible region with a 65% acceptance rate, the history match was concluded.
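
The quantities described above (per-output implausibility of actual runs, maximum implausibility per run, and the cumulative share of runs below a threshold, as plotted in Fig. 7) can be sketched as follows. The formula for equation 13 is not reproduced on this page, so the code assumes the form commonly used in history matching: absolute difference between the observation and the run sample mean, divided by the square root of the summed variances (observation, model discrepancy, and ensemble variability), with no code uncertainty term. The function names, the numbers, and the cutoff of 3 are all hypothetical.

```python
import numpy as np

def run_implausibility(z, run_mean, run_var, var_obs, var_disc):
    """Per-output implausibility of actual simulator runs (assumed form).

    z:        observed data, shape (n_outputs,)
    run_mean: sample mean of repeated runs, shape (n_inputs, n_outputs)
    run_var:  sample variance of repeated runs (ensemble variability)
    var_obs, var_disc: observation and model discrepancy variances
    No code uncertainty term appears, since the simulator itself was run.
    """
    z = np.asarray(z, dtype=float)
    run_mean = np.asarray(run_mean, dtype=float)
    run_var = np.asarray(run_var, dtype=float)
    return np.abs(z - run_mean) / np.sqrt(var_obs + var_disc + run_var)

def max_implausibility(impl):
    """Largest implausibility across outputs, one value per input."""
    return np.max(impl, axis=-1)

def fraction_below(max_impl, thresholds):
    """Share of runs whose maximum implausibility is below each threshold."""
    max_impl = np.asarray(max_impl, dtype=float)[:, None]
    thresholds = np.asarray(thresholds, dtype=float)[None, :]
    return (max_impl < thresholds).mean(axis=0)

# Made-up example: 4 inputs, 2 outputs.
z = [10.0, 5.0]
run_mean = [[10.0, 5.0], [13.0, 5.0], [10.0, 10.0], [11.0, 6.0]]
run_var = np.ones((4, 2))
impl = run_implausibility(z, run_mean, run_var, var_obs=0.5, var_disc=0.5)
i_max = max_implausibility(impl)
cdf = fraction_below(i_max, thresholds=[1.0, 2.0, 3.0, 4.0])
acceptable = float((i_max < 3.0).mean())  # hypothetical cutoff of 3
print(cdf, acceptable)
```

In this made-up example, three of the four inputs have a maximum implausibility below 3, so the acceptance rate is 0.75. Computing the same share separately for each wave's runs and plotting it against the threshold would give curves of the kind the figure caption describes.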