Comparing adaptive and fixed bandwidth-based kernel density estimates in spatial cancer epidemiology.

Lemke D, Mattauch V, Heidinger O, Pebesma E, Hense HW - Int J Health Geogr (2015)

Bottom Line: We observed that the fixed bandwidth-based sRRF was distinguished by a conservative behavior in identifying these urban 'risk areas', that is, a reduced sensitivity but increased specificity due to oversmoothing as compared to the adaptive risk estimator. In contrast, the latter appeared more competitive through variance stabilization, resulting in a higher sensitivity, while the specificity was equal as compared to the fixed risk estimator. Halving the originally determined bandwidths led to a simultaneous improvement of sensitivity and specificity of the adaptive sRRF, while the specificity was reduced for the fixed estimator.

Affiliation: Institute of Epidemiology and Social Medicine, Medical Faculty, Westfälische Wilhelms-Universität Münster, Münster, Germany. dorothea.lemke@uni-muenster.de.

ABSTRACT

Background: Monitoring spatial disease risk (e.g. identifying risk areas) is of great relevance in public health research, especially in cancer epidemiology. A common strategy uses case-control studies and estimates a spatial relative risk function (sRRF) via kernel density estimation (KDE). This study was set up to evaluate the sRRF estimation methods, comparing fixed with adaptive bandwidth-based KDE, and how they were able to detect 'risk areas' with case data from a population-based cancer registry.
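As a rough illustration of the estimator described above, the following sketch computes a log relative risk surface as the difference of two fixed-bandwidth Gaussian kernel density estimates, one for cases and one for controls. The function names, the isotropic Gaussian kernel, the use of one common bandwidth `h` for both samples, and the omission of edge correction are assumptions made for this example; they are not taken from the study.

```python
import numpy as np

def gaussian_kde_2d(points, grid, h):
    """Fixed-bandwidth isotropic Gaussian KDE evaluated at grid locations.

    points: (n, 2) observed locations, grid: (m, 2) evaluation locations,
    h: scalar bandwidth (same units as the coordinates).
    """
    # Squared distances between every grid location and every point: (m, n)
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * d2 / h**2) / (2.0 * np.pi * h**2)
    return kernel.mean(axis=1)

def log_relative_risk(cases, controls, grid, h):
    """Log ratio of case density to control density (assumed common bandwidth)."""
    f = gaussian_kde_2d(cases, grid, h)      # case density
    g = gaussian_kde_2d(controls, grid, h)   # control density
    return np.log(f) - np.log(g)
```

Values above zero indicate locations where cases are denser than the control sample would suggest; identical case and control samples give a surface that is zero everywhere.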

Methods: The sRRF were estimated within a defined area, using locational information on incident cancer cases and on a spatial sample of controls, drawn from a high-resolution population grid recognized as underestimating the resident population in urban centers. The spatial extensions of these areas with underestimated resident population were quantified with population reference data and used in this study as 'true risk areas'. Sensitivity and specificity analyses were conducted by spatial overlay of the 'true risk areas' and the significant (α=.05) p-contour lines obtained from the sRRF.
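The overlay-based sensitivity and specificity analysis can be sketched on a rasterized version of the study area. In this minimal example, `detected` is a boolean grid marking cells inside a significant p-contour and `truth` marks cells inside a 'true risk area'. Working with grid cells rather than polygon overlays, and the function name itself, are assumptions for illustration only.

```python
import numpy as np

def sensitivity_specificity(detected, truth):
    """Cell-wise sensitivity and specificity of detected vs. true risk areas."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(detected & truth)      # risk cells correctly flagged
    fn = np.sum(~detected & truth)     # risk cells missed
    tn = np.sum(~detected & ~truth)    # non-risk cells correctly left unflagged
    fp = np.sum(detected & ~truth)     # non-risk cells wrongly flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity
```

An oversmoothed estimator that flags few cells will tend to score low on sensitivity and high on specificity under this measure, which matches the conservative behavior reported for the fixed bandwidth.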

Results: We observed that the fixed bandwidth-based sRRF was distinguished by a conservative behavior in identifying these urban 'risk areas', that is, a reduced sensitivity but increased specificity due to oversmoothing as compared to the adaptive risk estimator. In contrast, the latter appeared more competitive through variance stabilization, resulting in a higher sensitivity, while the specificity was equal as compared to the fixed risk estimator. Halving the originally determined bandwidths led to a simultaneous improvement of sensitivity and specificity of the adaptive sRRF, while the specificity was reduced for the fixed estimator.
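To make the contrast between the two bandwidth regimes concrete, the sketch below derives point-specific bandwidths with Abramson's inverse square-root rule, a common adaptive scheme: each observation gets a bandwidth proportional to the inverse square root of a pilot density at its location, rescaled so that the geometric mean of all bandwidths equals a global value `h0`. Whether the study used exactly this variant is not stated in the abstract, so the rule, the pilot bandwidth `h_pilot`, and the function names are assumptions.

```python
import numpy as np

def pilot_density(points, h_pilot):
    """Fixed-bandwidth Gaussian pilot density evaluated at the points themselves."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * d2 / h_pilot**2) / (2.0 * np.pi * h_pilot**2)
    return kernel.mean(axis=1)

def abramson_bandwidths(points, h0, h_pilot):
    """Adaptive bandwidths: small where points are dense, large where sparse."""
    inv_sqrt = pilot_density(points, h_pilot) ** -0.5
    # Geometric mean of the inverse square-root pilot values
    gamma = np.exp(np.mean(np.log(inv_sqrt)))
    return h0 * inv_sqrt / gamma
```

Under this rule, densely populated urban locations receive narrow kernels and sparse rural locations receive wide ones, which is the mechanism behind the reduced urban oversmoothing described in the results.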

Conclusion: The fixed risk estimator is characterized by an oversmoothing tendency in urban areas, while it overestimates the risk in rural areas. The use of an adaptive bandwidth regime attenuated this pattern, but led in general to a higher false positive rate, because, in our study design, the majority of true risk areas were located in urban areas. However, there is a strong need for further optimizing the bandwidth selection methods, especially for the adaptive sRRF.

Figure 1: Overview of the data sources used. (a): Location of the study area in Germany. (b): Disaggregated, high-resolution population grid using the EEA Fast Track Service Precursor on Land Monitoring dataset. (c): Relative errors of the disaggregated population estimates using reference data at census tract level (N = 1,983).

Mentions: Data on individual incident cancer cases were obtained from the epidemiologic cancer registry of North Rhine-Westphalia [30]. The records for all 199,280 cancer cases arising between 1986 and 2005 in the Regierungsbezirk Münster (an administrative district in the Northwest of Germany with a total population of 2.7 million) (Figure 1a) were geo-coded. The geo-coding was performed by the NRW state office for information and technology [31].

