How to Estimate Epidemic Risk from Incomplete Contact Diaries Data?

Mastrandrea R, Barrat A - PLoS Comput. Biol. (2016)

Bottom Line: Most importantly, we investigate if and how information gathered from contact diaries can be used in such simulations in order to yield an accurate description of the epidemic risk, assuming that data from sensors represent the ground truth. The contact networks built from contact sensors and diaries present indeed several structural similarities: this suggests the possibility to construct, using only the contact diary network information, a surrogate contact network such that simulations using this surrogate network give the same estimation of the epidemic risk as simulations using the contact sensor network. We present and compare several methods to build such surrogate data, and show that it is indeed possible to obtain a good agreement between the outcomes of simulations using surrogate and sensor data, as long as the contact diary information is complemented by publicly available data describing the heterogeneity of the durations of human contacts.


Affiliation: Aix Marseille Univ, Univ Toulon, CNRS, CPT, Marseille, France.

ABSTRACT
Social interactions shape the patterns of spreading processes in a population. Techniques such as diaries or proximity sensors make it possible to collect data about encounters and to build networks of contacts between individuals. The contact networks obtained from these different techniques are, however, quantitatively different. Here, we first show how these discrepancies affect the prediction of the epidemic risk when these data are fed to numerical models of epidemic spread: the low participation rate, the under-reporting of contacts and the overestimation of contact durations in contact diaries with respect to sensor data lead to important differences in the outcomes of the corresponding simulations, with for instance an enhanced sensitivity to initial conditions. Most importantly, we investigate if and how information gathered from contact diaries can be used in such simulations to yield an accurate description of the epidemic risk, assuming that data from sensors represent the ground truth. The contact networks built from contact sensors and diaries do present several structural similarities: this suggests the possibility of constructing, using only the contact diary network information, a surrogate contact network such that simulations using this surrogate network give the same estimation of the epidemic risk as simulations using the contact sensor network. We present and compare several methods to build such surrogate data, and show that it is indeed possible to obtain good agreement between the outcomes of simulations using surrogate and sensor data, as long as the contact diary information is complemented by publicly available data describing the heterogeneity of the durations of human contacts.


Fig 1 (pcbi.1005002.g001): Distribution of final size of epidemics. 1000 SIR simulations performed on the original contact sensors network (CSN) and the original contact diaries network with durations respectively reported by students (CDND) and registered by sensors (CDNS and CDNS’). Each process starts with one random infected seed. β/μ = 30.
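
To make the setup in the caption concrete, here is a minimal sketch of how a distribution of final epidemic sizes could be produced from repeated SIR runs on a weighted contact network. It is not the authors' implementation. The discrete time step, the transmission rule `p = min(1, beta * w)` with weights scaled to [0, 1], the toy two-class network, and all function names are assumptions made for illustration; only the single random seed per run, the 1000 runs, and the ratio β/μ = 30 come from the caption.

```python
import random


def sir_final_size(weights, beta, mu, rng):
    """Run one discrete-time SIR outbreak; return the fraction ever infected.

    weights: dict node -> {neighbor: w}, symmetric, w in [0, 1] (assumed scaling).
    Assumed rule: an infectious node infects a susceptible neighbor with
    probability min(1, beta * w) per step and recovers with probability mu.
    """
    nodes = list(weights)
    infected = {rng.choice(nodes)}  # one random infected seed
    recovered = set()
    while infected:
        newly_infected = set()
        for i in sorted(infected):  # sorted for reproducibility with a fixed seed
            for j, w in weights[i].items():
                susceptible = j not in infected and j not in recovered
                if susceptible and rng.random() < min(1.0, beta * w):
                    newly_infected.add(j)
        newly_recovered = {i for i in sorted(infected) if rng.random() < mu}
        recovered |= newly_recovered
        infected = (infected - newly_recovered) | newly_infected
    return len(recovered) / len(nodes)


def final_size_distribution(weights, beta, mu, runs=1000, seed=0):
    """Repeat the outbreak `runs` times and collect the final sizes."""
    rng = random.Random(seed)
    return [sir_final_size(weights, beta, mu, rng) for _ in range(runs)]


def two_class_network(class_size=5, w_within=0.05, w_between=0.005):
    """Toy network (hypothetical): two fully connected classes, one weak bridge."""
    weights = {i: {} for i in range(2 * class_size)}
    for c in range(2):
        members = range(c * class_size, (c + 1) * class_size)
        for i in members:
            for j in members:
                if i < j:
                    weights[i][j] = weights[j][i] = w_within
    weights[0][class_size] = weights[class_size][0] = w_between
    return weights


net = two_class_network()
sizes = final_size_distribution(net, beta=0.3, mu=0.01, runs=1000)  # beta/mu = 30
print("fraction of runs reaching more than one class:",
      sum(s > 0.5 for s in sizes) / len(sizes))
```

A histogram of `sizes` plays the role of the distributions shown in Fig 1. With a weak bridge between the two classes, some runs stay confined to the seed's class, which is the kind of intermediate peak discussed in the text for networks with strong community structure.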

Mentions: We first compare in Fig 1 the outcome of simulations of the SIR model performed on the CSN and on the two versions of the CDN described above (CDND with weights reported by students and CDNS with weights registered by sensors assigned randomly to the links), for one specific value of β/μ = 30. The three distributions of epidemic sizes are very different from each other. The outcome of simulations performed using the CSN is quite standard, with a fraction of small outbreaks that reach only a small fraction of the population and another peak corresponding to large outbreaks. As shown in the Supporting Information, the outcome does not depend on the class of the initial seed.

The shape of the distribution obtained when using the CDND is more peculiar, with a series of peaks, including one at very large epidemic sizes. Such structure is typical of spreading processes on networks with a strong community structure [4], which corresponds to the results of [21]: (i) due to the low participation rate and the under-reporting, the community structure of the CDN is stronger than that of the CSN, with few links between classes; depending on the seed, the simulated disease can thus remain confined in one class or in a group of a few classes, leading to the peaks at intermediate values of the epidemic size; we moreover show in the SI that the outcome depends on the class of the initial seed for the CDN but not for the CSN; (ii) on the other hand, as contact durations are overestimated, the propagation probability on each link is also overestimated and, if the disease manages to spread between classes, almost all individuals are affected, leading to the peak at large epidemic sizes.

The CDNS case shows a different result: no more than half of the whole population is affected by the spread. As the weights have in this case the same statistics as in the CSN, this is simply due to the low participation rate [40] and the much smaller average degree in the CDN with respect to the CSN. We also note that, since the weights are assigned randomly to the links between students, the structure of the contact matrix giving the average durations of contacts between students of different classes can strongly differ between the CDNS and both the CSN and the CDND, leading to different patterns of propagation between classes (see Supporting Information). We finally note that the simulations on the CDNS’, which keeps the distribution of the weights from the CSN and in which larger weights are assigned to links with longer reported durations, yield even smaller outbreaks. This is probably due to the fact that the large weights reported in the diaries tend to be within classes, so that the links bridging classes and favoring the spread tend to have smaller weights in the CDNS’ than in the CDNS. We also show in the SI the temporal evolution of the density of infectious individuals for the various cases considered here.
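
The two ways of putting sensor-derived weights on diary links (random assignment for the CDNS, rank-matched assignment for the CDNS’) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: drawing one weight per diary link with replacement from the list of sensor weights is an assumption, and the function names, link labels, and numerical values are hypothetical.

```python
import random


def assign_random(diary_links, sensor_weights, rng):
    """CDNS-style: each diary link gets a weight drawn at random
    from the sensor weights (sampling with replacement is assumed)."""
    return {link: rng.choice(sensor_weights) for link in diary_links}


def assign_by_rank(diary_durations, sensor_weights, rng):
    """CDNS'-style: draw one sensor weight per diary link, then give the
    larger weights to the links with longer reported durations."""
    drawn = sorted(rng.choice(sensor_weights) for _ in diary_durations)
    ranked_links = sorted(diary_durations, key=diary_durations.get)
    return dict(zip(ranked_links, drawn))


# Hypothetical data: reported durations per diary link, and weights
# measured by sensors on a (larger) set of sensor links.
diary_durations = {("a", "b"): 60, ("a", "c"): 5, ("b", "c"): 15, ("c", "d"): 120}
sensor_weights = [20, 40, 40, 60, 120, 300, 900, 2400]

rng = random.Random(0)
cdns = assign_random(diary_durations, sensor_weights, rng)
cdns_prime = assign_by_rank(diary_durations, sensor_weights, rng)
print(cdns)
print(cdns_prime)
```

In both cases the weights follow the sensor statistics. The difference is that the rank-matched version preserves the ordering of the reported durations, so if long reported contacts sit mostly within classes, the bridging links between classes end up with the smaller weights, in line with the explanation given in the text for the smaller outbreaks on the CDNS’.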

